ABSTRACT

PROBLEM TO BE SOLVED: To provide a retrieval method and device that
resolve the heterogeneity of plural data bases differing in at least
part of their data structures, DBMS (data base management system), and
data expression, without requiring correspondence relations or conversion
rules to be designated for every retrieval request that retrieves data
from the plural data bases, and a recording medium on which a retrieval
program for resolving the heterogeneity of plural data bases is recorded.

SOLUTION: Data expression formats, conversion functions between data 
expression formats, etc., are stored and managed in an information resource 
dictionary 7a and a conversion function library 9, and these data 
expression formats and conversion functions between the expression formats 
are used so that an inquiry conversion part 5 converts the expression 
format of data of a retrieval request to that of data used in a retrieval 
object data base 21 to generate a retrieval request statement, and a data 
base communication control part 11 uses this retrieval request statement 
to retrieve the retrieval object data base 21. 
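
The conversion described in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration only: the data base names, the data item, the expression formats, and the helper functions do not come from the abstract. A small mapping stands in for the information resource dictionary, another for the conversion function library, and build_request plays the role of the inquiry conversion part.

```python
from typing import Callable, Dict, Tuple

# Hypothetical "information resource dictionary": the expression format each
# data base uses for a given data item.
FORMAT_DICTIONARY: Dict[Tuple[str, str], str] = {
    ("sales_db", "order_date"): "yyyymmdd",
    ("archive_db", "order_date"): "dd/mm/yyyy",
}


def iso_to_yyyymmdd(value: str) -> str:
    year, month, day = value.split("-")
    return f"{year}{month}{day}"


def iso_to_ddmmyyyy(value: str) -> str:
    year, month, day = value.split("-")
    return f"{day}/{month}/{year}"


# Hypothetical "conversion function library", keyed by (source, target) format.
CONVERSION_LIBRARY: Dict[Tuple[str, str], Callable[[str], str]] = {
    ("iso", "yyyymmdd"): iso_to_yyyymmdd,
    ("iso", "dd/mm/yyyy"): iso_to_ddmmyyyy,
}


def build_request(database: str, table: str, item: str, value: str,
                  request_format: str = "iso") -> Tuple[str, Tuple[str, ...]]:
    """Convert the value to the target data base's format and build a statement."""
    target_format = FORMAT_DICTIONARY[(database, item)]
    if target_format != request_format:
        value = CONVERSION_LIBRARY[(request_format, target_format)](value)
    # The value is passed as a parameter rather than spliced into the text.
    return f"SELECT * FROM {table} WHERE {item} = ?", (value,)
```

The caller states the request once in a single format; the per-data-base format is looked up rather than supplied with each request, which is the point the abstract makes.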
ABSTRACT

PROBLEM TO BE SOLVED: To perform standardized translation fast with simple 
constitution by a machine translation system which makes use of an example 
data base.

SOLUTION: In the example data base 8 many translation sentences are
stored and identical or similar sentences are extracted for translation.
When no similar sentence is found, the original text is divided and
identical or similar examples are extracted for the divided sentences for 
translation. When no similar sentence is found for a divided sentence, 
translation is carried out by words or idioms by performing retrieval from 
a dictionary data base 9. The translation result is post-edited by
referring to the example data base 8 and dictionary data base 9. The 
completed translation sentences are stored in the example data base 8 
together with the original text. Each time translation is performed, 
example data are stored, so the example data base is automatically 
enriched. The standardized translation is performed fast only by using 
simple grammatical rules 3, 5, and 6, the example data base 8, and the 
dictionary data base 9. 
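
The fallback order described above (whole sentence, then divided sentences, then words) can be sketched as follows. This is only an illustration under stated assumptions: similarity is measured with difflib.SequenceMatcher, sentences are divided at commas, the threshold of 0.8 is arbitrary, and the post-editing step is omitted. None of these choices is taken from the abstract.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional


def find_example(text: str, examples: Dict[str, str],
                 threshold: float = 0.8) -> Optional[str]:
    """Return the translation of the most similar stored example, if close enough."""
    best_score, best_translation = 0.0, None
    for source, translation in examples.items():
        score = SequenceMatcher(None, text, source).ratio()
        if score > best_score:
            best_score, best_translation = score, translation
    return best_translation if best_score >= threshold else None


def translate(text: str, examples: Dict[str, str],
              dictionary: Dict[str, str], threshold: float = 0.8) -> str:
    result = find_example(text, examples, threshold)
    if result is None:
        parts = []
        # Assumed division rule: split the original text at commas.
        for segment in (s.strip() for s in text.split(",")):
            hit = find_example(segment, examples, threshold)
            if hit is None:
                # Last resort: word-by-word lookup in the dictionary data base.
                hit = " ".join(dictionary.get(w, w) for w in segment.split())
            parts.append(hit)
        result = ", ".join(parts)
    # Store the completed pair so the example data base grows with each run.
    examples[text] = result
    return result
```

Because every completed translation is written back, a sentence that needed division or dictionary lookup once is found as a whole example the next time.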
ABSTRACT

PROBLEM TO BE SOLVED: To retrieve related data across plural data bases
without defining the relations between their tables beforehand, by using
a dictionary that registers the type and the range of the values of each
data item, detecting the relations between tables, and retrieving the
data of the plural data bases in relation by using the detected relations.
operator reports a table name to be accessed and the
request of the table capable of being related to the table 
formation inside the dictionary 1 through an operator coping 
operator coping part 2 informs a relation detection part 3 of 
elation detection part 3 extracts the table provided with the 
f the type of the same value as the data item present in the 
e specified table name and the candidates of the table capable 
ated are reported to a relation judgement part 4. The relation 
lap of the range of the value is judged by the relation 
rt 4, the overlapping range is detected and the relation 
s reported to the operator coping part 2.
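
The relation detection outlined in this record (same value type, then overlapping value range) might look like the sketch below. The entry layout, the table and item names, and the use of integer ranges are assumptions for illustration; the record itself does not specify how the dictionary is stored.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ItemEntry:
    """One hypothetical dictionary entry: type and value range of a data item."""
    table: str
    item: str
    value_type: str
    low: int
    high: int


Relation = Tuple[str, str, str, Tuple[int, int]]


def detect_relations(target_table: str,
                     dictionary: List[ItemEntry]) -> List[Relation]:
    """Find items in other tables with the same type and an overlapping range."""
    relations: List[Relation] = []
    targets = [e for e in dictionary if e.table == target_table]
    for target in targets:
        for candidate in dictionary:
            # Relation detection: other tables only, same value type only.
            if candidate.table == target_table:
                continue
            if candidate.value_type != target.value_type:
                continue
            # Relation judgement: keep the candidate only if the ranges overlap.
            low = max(target.low, candidate.low)
            high = min(target.high, candidate.high)
            if low <= high:
                relations.append(
                    (target.item, candidate.table, candidate.item, (low, high)))
    return relations
```

Each result names the item in the specified table, the related table and item, and the overlapping range that was detected.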
ABSTRACT

PROBLEM TO BE SOLVED: To facilitate information retrieval corresponding to
plural different data base systems connected to a network.

SOLUTION: Reference information for accessing plural data base systems
is stored in advance. When an information retrieval request is obtained
(S101), the request is analyzed (S102) to specify, by referring to the
reference information, which data base system stores the information
corresponding to the request. In addition, information on how to obtain
the information is obtained (S103 and S104), an information retrieval
instruction statement is generated based on the information showing the
storing position and the obtaining method corresponding to the request
(S105), a retrieval request is given to the data base system
corresponding to the request (S106), and the retrieval result is provided
to the user (S107).
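
Steps S101 to S107 can be read as a small routing pipeline. The sketch below is one possible reading, not the patented implementation: the "topic: terms" request syntax, the contents of REFERENCE_INFO, the two obtaining methods, and the statement formats are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Statement = Tuple[str, Tuple[str, ...]]

# Hypothetical reference information stored in advance:
# topic -> (data base system holding it, obtaining method).
REFERENCE_INFO: Dict[str, Tuple[str, str]] = {
    "patent": ("patent_db", "sql"),
    "news": ("news_db", "keyword"),
}


def analyze(request: str) -> Tuple[str, str]:
    """S102: split an assumed 'topic: terms' request into its two parts."""
    topic, _, terms = request.partition(":")
    return topic.strip().lower(), terms.strip()


def build_statement(method: str, terms: str) -> Statement:
    """S105: generate a retrieval statement suited to the obtaining method."""
    if method == "sql":
        return "SELECT * FROM documents WHERE title LIKE ?", (f"%{terms}%",)
    if method == "keyword":
        return "FIND " + " AND ".join(terms.split()), ()
    raise ValueError(f"unknown obtaining method: {method}")


def retrieve(request: str,
             systems: Dict[str, Callable[[Statement], List[str]]]) -> List[str]:
    topic, terms = analyze(request)               # S101, S102
    system_name, method = REFERENCE_INFO[topic]   # S103, S104
    statement = build_statement(method, terms)    # S105
    return systems[system_name](statement)        # S106, result returned for S107
```

Here each entry in systems is a callable standing in for one connected data base system, so the routing logic can be exercised without a network.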
ABSTRACT

PROBLEM TO BE SOLVED: To perform retrieval from multiple data bases
on the basis of an array of words without paying attention to the schema, by
generating a retrieval command according to a selected concept dictionary
and performing retrieval from a data base according to the generated
retrieval command.

SOLUTION: A concept dictionary selection part 11 determines a concept
dictionary to be retrieved according to the vocabulary that is being
inquired. A relation determination part 12 determines the relation between
words according to the concept dictionary of the selected data base. A
retrieval command generation part 13 converts the array of vocabularies
into a retrieval command according to the relation between the words. This
retrieval command is executed by a retrieval command execution part 14. The
retrieval command execution part 14 requests a data base management
system which manages element data bases to execute the command as the
conversion result, thereby performing retrieval. The retrieval result is
displayed at a retrieval result display part 15.

26/5/9 (Item 9 from file: 347)
DIALOG(R) File 347: JAPIO
(c) 2003 JPO & JAPIO. All rts. reserv.

05235078 **Image available**
DESIGN SUPPORT DEVICE

PUB. NO.: 08-190578 [JP 8190578 A]
PUBLISHED: July 23, 1996 (19960723)
INVENTOR(s): HAGA NORIYUKI
             AKASAKA SHINGO
             ARAI YOSHIHISA
             SHIBATA NOBORU
APPLICANT(s): HITACHI LTD [000510] (A Japanese Company or Corporation), JP
              (Japan)
APPL. NO.: 07-000021 [JP 9521]
FILED: January 04, 1995 (19950104)
INTL CLASS: [6] G06F-017/50; G06F-017/30
JAPIO CLASS: 45.4 (INFORMATION PROCESSING -- Computer Applications); 26.9
             (TRANSPORTATION -- Other)
JAPIO KEYWORD: R060 (MACHINERY -- Automatic Design)

ABSTRACT 

PURPOSE: To provide a design support device which evaluates the similarity
of shape and presents the result even when shape features of a
product are included in the requested specifications at the time of
similarity retrieval of design examples.

CONSTITUTION: This design support device is provided with a design example
data base 30 in which plural design examples, such as drawings,
characterized by features such as the shape of the product are stored in
advance; a shape feature input means 10 which can input the shape features
of requested specifications; a shape feature evaluation means 20 which
compares the inputted shape features with the shape features of examples
registered in advance in the design example data base 30 to calculate
the degrees of similarity; and a retrieval result display means 70 which
ranks the examples in order of degree of similarity and displays them.
The device thereby generates a design result meeting the requested
specifications, or information useful for designing to the requested
specifications.
ABSTRACT

PURPOSE: To improve the operability and to very easily acquire required
information.

CONSTITUTION: A database 1 is built up by taking facilities especially into
account. The database 1 has data on the facilities to be managed, such as
drawings and various documents. The database 1 is linked to a facility
information expression section 3, being a retrieval object, via an
information retrieval interface 2. The information retrieval interface 2
uses drawings expressing a facility totally. The facility information
expression section 3 expresses various information relating to the
facility itself. A facility point-out section 4 points out a desired
facility based on the facility information and a 1st processing section 5
ranks the facility information according to an object of information
retrieval. The 1st processing section 5 reserves a display area according
to the ranking of the facility information and processes the vertical
relation of display patterns. After the facility information is displayed,
a 2nd processing section 6 designates the relation of drawings and
documents in the information to acquire new information.
Affinity based similarity search method, involves generating inverted
list of document identifiers for each term in lexicon and evaluating
search query using affinity lists and inverted lists

Patent Assignee: INT BUSINESS MACHINES CORP (IBMC)

Inventor: AGGARWAL C C; YU P S 

Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 
Abstract (Basic): US 6587848 B1

NOVELTY - The method involves generating an affinity list for each
term in a lexicon of terms of documents to be used in the similarity
search. An inverted list of document identifiers is generated for each
term in the lexicon using iterative techniques on the affinity list.
The search query is evaluated using the affinity lists and inverted
lists, where the query originates from the user at the client device.

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are also included for the
following:

(a) an apparatus for performing an affinity based similarity search

(b) an article of manufacture for performing affinity based
similarity search, comprising a machine readable medium containing
programs which are executed.

USE - Used for performing similarity searches in documents.

ADVANTAGE - The quality of the results from the search engine is less
sensitive to the choice of search terms. The method takes into account
words that are not included in the query, which increases the
specificity and effectiveness of the query.

DESCRIPTION OF DRAWING (S) - The drawing shows a flow diagram 
illustrating the process of rank ordering documents based on a 
similarity value. 
Abstract (Basic): US 20020120618 A1

NOVELTY - An expansion unit (9) refers to information about 
predicates used in query processing and the degree of the connections 
of the predicates that are stored in a predicate dictionary (4) for 
converting a query which is input into the integrated database system 
(1), into several query sets. 

DETAILED DESCRIPTION - An INDEPENDENT CLAIM is included for a
recording medium storing a query processing program.

USE - For query processing e.g. for DNA sequence analysis
application.

ADVANTAGE - By referring to the information about the predicates 
for converting the query input into the database system into 
several query sets, the cost of the query processing is minimized 
and accurate query results are obtained. 

DESCRIPTION OF DRAWING (S) - The figure shows the block diagram of 
the integrated database system. 

Integrated database system (1) 

Predicate dictionary (4) 

Expansion unit (9) 
Abstract (Basic): JP 2002197188 A

NOVELTY - The monitoring system instantaneously judges
information such as the medication's influence effect, the group of
patients for whom it is applicable, its adverse effect on mishandling,
etc., related to an input medication prescription, by searching
several databases. A display screen displays the judged information
as a list screen.

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are included for the
following:

(1) Medication usage monitoring method; and

(2) Machine readable recording medium storing program for
medication usage monitoring method.

USE - Used for patients such as pregnant women, lactating women,
diseased patients, etc.

ADVANTAGE - The content of prescribed medication is confirmed
correctly before using.

DESCRIPTION OF DRAWING (S) 
screen. (Drawing includes non- 
Abstract (Basic): US 20020038308 A1

NOVELTY - Several data elements that correspond to at least one
data element of several databases (108-114) are stored in a global
dictionary system. The relationships between the two or more data
elements are identified.

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are also included for the
following:

(a) Data retrieval method;

(b) Data retrieval system;

(c) Global data dictionary system

USE - For creating virtual data warehouse employing several
databases.

ADVANTAGE - Facilitates syntactic and semantic integration of
several databases into a single logical entity, which is accessible
through one global data dictionary. Hence allows users to conduct
expansive searches or queries regardless of the database
management system.

DESCRIPTION OF DRAWING (S) - The figure shows the structure of 
database system. 

Databases (108-114) 
Abstract (Basic): KR 2001086665 A

NOVELTY - A system and method for semiconductor electronic commerce
are provided to synthetically offer information by connecting a parts
information DB, an enterprise information DB, a transaction information
DB, a newspaper information DB, a dictionary information DB, a job
offer/hunting DB, a new product information DB and so on.

DETAILED DESCRIPTION - A semiconductor electronic commerce system
is composed of a server (100), various databases (110-160), a parts
supply enterprise (330), a client (340), and a job offerer/seeker (350).
An enterprise information DB (160) stores enterprise information. A
parts information DB (150) stores parts information. A transaction
information DB (140) stores transaction information. A newspaper
information DB (110) stores newspaper information. A dictionary
information DB (170) stores dictionary information. A job
offer/hunting DB (120) stores job offer/hunting information. A new
product information DB (130) stores new product information. In the case
of searching for enterprise information (310), the client (340)
extracts enterprise information from the enterprise information
DB (160). In the case of searching for dictionary information (320),
the client (340) extracts dictionary information from the dictionary
information DB (170). The parts supply enterprise (330) extracts parts
information from the parts information DB (150). The job
offerer/seeker (350) extracts job offer/hunting information from the job
offer/hunting DB (120).
Abstract (Basic): WO 200190953 A2

NOVELTY - A processor (202) receives a natural language query and 
plural database objects (204) and produces a query result, such 
as information relevant to the combination of the query and the 
objects. The query is mapped to the objects using a reference 
dictionary (208) and a reference dictionary object identifier (205) 
parses the queries and generates one or more objects recognized in 
the dictionary. The processor determines the optimal interpretations
of the received objects and a mapping processor (206) performs mapping
between incoming objects and database objects, to generate database
queries, while a keyword (209) points to semantic objects, pointing
ultimately to the database object values.

DETAILED DESCRIPTION - AN INDEPENDENT CLAIM is included for a 
method for processing a natural language input. 

USE - Recognizing natural language via a user interface. 

DESCRIPTION OF DRAWING (S) - The drawing shows a natural language
query processor.

Processor (202)

Database objects (204)

Reference dictionary (208)

Identifier (205)

Mapping processor (206)

Keyword (209)
Search system used in Internet applications, extracts search objective
database based on stored search log indicating search condition of
various databases

Patent Assignee: NEC CORP (NIDE)

Number of Countries: 001 Number of Patents: 002 

Patent Family: 
JP 2001273297 A 20 G06F-017/30 Div ex application JP 97315099
JP 3248530 B2 19 G06F-017/30 Div ex application JP 97315099
                             Previous Publ. patent JP 2001273297

Abstract (Basic): JP 2001273297 A

NOVELTY - An operation log acquisition section (110) stores a
search log indicating the search conditions of various databases. A
preference database extraction section (130) extracts the search
objective database, and a search device (300) displays the search
objective database sequentially, based on the search conditions.

DETAILED DESCRIPTION - An INDEPENDENT CLAIM is also included for 
recording medium. 

USE - For searching database in Internet applications. 

ADVANTAGE - Searches suitable database efficiently within a 
short time. 

DESCRIPTION OF DRAWING (S) - The figure shows the block diagram of
the search system. (Drawing includes non-English language text).

Operation log acquisition section (110)
Preference database extraction section (130)
Search device (300)
Abstract (Basic): WO 200161571 A2

NOVELTY - Data items stored in a database (15) have several
associated attributes, which are logically linked to stored values and
to weights for the associated attributes. A search system (20), when
given a data item as input, identifies another data item whose
attributes, values and associated weights are similar to those of the
input data item.

DETAILED DESCRIPTION - An INDEPENDENT CLAIM is also included for 
database management method. 

USE - For searching and retrieving data from computer database 
e.g. in Internet. 

ADVANTAGE - Linking data attributes with relevant weightings
increases the precision of search results and the usefulness of the order
in which search results are presented.

DESCRIPTION OF DRAWING (S) - The figure shows the block diagram of
the attribute tagging and matching system of the database management system.

Database (15) 
Search system (20) 

pp; 33 DwgNo 1/13 

Title Terms: DATABASE; MANAGEMENT; SYSTEM; IDENTIFY; DATA; ITEM; ATTRIBUTE;
  VALUE; WEIGHT; SIMILAR; DATA; ITEM; INPUT; SEARCH; SYSTEM
Derwent Class: T01
International Patent Class (Main): G06F-007/00; G06F-017/30
File Segment: EPI



26/5/28 (Item 18 from file: 350)
DIALOG(R) File 350: Derwent WPIX
(c) 2003 Thomson Derwent. All rts. reserv.

013980512 **Image available**
WPI Acc No: 2001-464726/200150
XRPX Acc No: N01-344721

Computing apparatus for collaboratively searching several knowledge 
databases formed by a combination of databases from a global computer 
network uses a query searcher for conducting search queries of 
content of knowledge database 

Patent Assignee: ZENTECH INC (ZENT-N); ZEN TECH INC (ZENT-N)
Inventor: DELANO P A; DELANO P
Number of Countries: 088 Number of Patents: 003
Patent Family:
Patent No      Kind  Date      Applicat No     Kind  Date      Week
WO 200109747   A2    20010208  WO 2000US20288  A     20000726  200150 B
AU 200063759   A     20010219  AU 200063759    A     20000726  200150
US 6430558     B1    20020806  US 99365927     A     19990802  200254

Priority Applications (No Type Date): US 99365927 A 19990802
Patent Details:
Patent No      Kind  Lan Pg  Main IPC      Filing Notes
WO 200109747   A2    E   27  G06F-017/00
   Designated States (National): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN
   CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ
   LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK
   SL TJ TM TR TT UA UG US UZ VN YU ZA ZW
   Designated States (Regional): AT BE CH CY DE DK EA ES FI FR GB GH GM GR
   IE IT KE LS LU MC MW MZ NL OA PT SD SE SL SZ TZ UG ZW
AU 200063759   A             G06F-017/00   Based on patent WO 200109747
US 6430558     B1            G06F-017/30

Abstract (Basic): WO 200109747 A2

NOVELTY - Apparatus uses query searcher to conduct (20) search 
queries of content of knowledge database. The search results 
ranker responds to the searcher to provide ranked content search 
results of the relative closeness of a requested query inputted by 
a user when conducting a search through the use of several 
client-user computer interfaces. Results updater continuously updates 
content search results. 

DETAILED DESCRIPTION - Independent claims describe an apparatus and 
a method for collaboratively searching knowledge databases and a 
collaborative searching engine. 

USE - As an apparatus and method for collaboratively searching
several knowledge databases formed by a combination of databases from
a global computer network.

ADVANTAGE - Advantageously provides an apparatus and methods for
collaboratively searching knowledge databases such as those provided
by a global computer network and substantially increases access to
other related information with the knowledge databases.

DESCRIPTION OF DRAWING (S) - The drawing shows a schematic block
diagram of an apparatus for collaboratively searching knowledge
databases.

Collaborative search engine (20)
Abstract (Basic): JP 2001084252 A

NOVELTY - An analysis unit analyzes the structure of an input search
sentence, with reference to a word dictionary. An index data
generator generates information used as an index at the time of
searching a database storing several documents. The document which
is most similar to that of the input search sentence is extracted
from the database.

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are also included for the 
following: 

(a) Document;

(b) Search procedure;

(c) Recording medium

USE - Similar document search system.

ADVANTAGE - Enables accurately searching the required document
from the database.

DESCRIPTION OF DRAWING (S) - The figure shows the block diagram of
similar document search system. (Drawing includes non-English
language text).
Internet site searching and listing system includes server search
program to search site listings database in response to search 
inquiries by taking into account new denominated bid value entered by 
subscriber 

Patent Assignee: SEARCHUP INC (SEAR-N) 

Inventor: BUCK B J; MELCHER M 

Number of Countries: 001 Number of Patents: 001 
Patent Family: 
Abstract (Basic): US 6078866 A

NOVELTY - A server search program searches a site listings
database for titles of content that match the given search inquiry
from a user. The search program searches the site listings database
in response to search inquiries from users by automatically taking
into account the new denominated value bid entered by the subscriber
for the subscriber's site listing.

DETAILED DESCRIPTION - A listing server is connected to the
Internet which is accessible by several users. The site listings
database contains several site listings, each of which is provided
by a site listing subscriber. The database includes a title or
description of the content of the respective site, a network address at
which the site is accessed on the Internet, and a denominated value bid by
the subscriber for the site listing while it is maintained on the
listing server. The server search program searches the site
listings database for titles or descriptions of content that match a
given search inquiry from the user, and orders the site listings
found in the search by their denominated values. The listing server
provides the search report of the denominated-value-ordered site
listings relevant to the search inquiry to the user in order
according to the denominated values bid by the subscribers for the
found site listings. A bid management program includes a subscriber
account interface for allowing a subscriber to connect online with the
listing server and to automatically enter a new denominated value bid
for the subscriber's site listing into the site listings database. An
INDEPENDENT CLAIM is also included for Internet site searching and
listing method.

USE - Internet site searching and listing system which is based 
on ranking of site listings based on monetary value. 

ADVANTAGE - The subscriber for a web site has the opportunity to
determine in competitive monetary terms where their site appears in
search results. This eliminates the use of arbitrary factors to compute
a relevancy ranking or a subjective determination of value by the
search service. The subscriber is allowed direct control of the
site listing. The freedom to make spontaneous modifications to the
search rankings provides the subscriber with a more rational and
responsive search service.

DESCRIPTION OF DRAWING (S) - The figure shows a diagram
illustrating the denominated value search service using a credit point
total to set the rankings of search listings.
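
The ordering of search results by denominated value bid can be sketched briefly. The sketch assumes a simple all-words title match and an in-memory list; the class name, field names, and example addresses are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SiteListing:
    title: str
    url: str
    bid: float  # denominated value bid by the subscriber


def search_listings(listings: List[SiteListing], inquiry: str) -> List[SiteListing]:
    """Return listings whose title contains every inquiry word, highest bid first."""
    words = inquiry.lower().split()
    matches = [l for l in listings if all(w in l.title.lower() for w in words)]
    return sorted(matches, key=lambda l: l.bid, reverse=True)


def update_bid(listings: List[SiteListing], url: str, new_bid: float) -> None:
    """Subscriber account interface: enter a new bid for an existing listing."""
    for listing in listings:
        if listing.url == url:
            listing.bid = new_bid
```

Because the order is computed at search time, a new bid takes effect on the next inquiry without any re-indexing.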
Abstract (Basic): EP 1003111 A1

NOVELTY - The method has a function to change over between several 
document databases (131,141), and a function to search a set of 
documents having a high relevance to search input from a selected 
document database in the order of higher relevance. The input may be a 
set of keywords, fragments of a document or any desired set of 
documents. The search results from the document database (131) can be 
used as input for searching another database (141).

DETAILED DESCRIPTION - Using the search module (143), a server
(14) calculates the relevance of the summary of the set of key
documents sent from the client to the target document database (141),
and returns document identifiers of high relevance to a client (11)
with a relevance weighting.

An INDEPENDENT CLAIM is included for:

(a) a service for searching documents

USE - In a document searching method for changing over between
several document databases, and constructing relationships between
these document databases.

ADVANTAGE - Allows a user to specify an arbitrary set of documents 
in an arbitrary document database, and to efficiently search sets 
of documents relating to this set of documents from within any 
particular database. 

DESCRIPTION OF DRAWING (S) - The drawing shows an example of the 
overall construction of a system implementing multiple document 
database search method. 

client (11) 

server (14) 

document databases (131,141) 
Web site searching and indexing system for Internet, provides search
report of listings relevant to search inquiry in which rank is 
assigned in order according to denominated values associated with 
listings 
Abstract (Basic): WO 200016218 A1

NOVELTY - A listing server connected to a network accessible by 
several users, provides search report of listings relevant to the 
search inquiry from user. The listings are assigned a rank in order, 
according to the denominated values associated with the listings. The 
denominated value is subscription fee of initially entered amount that 
may be adjusted during defined adjustment period. 

DETAILED DESCRIPTION - A listing server connected to a network
accessible by several users has several site listings databases.
Each database, provided by a site listing subscriber, has a title or
description of content of the respective site, network address at which 
the site can be accessed by the network and denominated value to be 
paid by the subscriber associated with site listing. The server has 
account interface that allows subscriber to enter information to set 
subscription fee for respective listing in order to obtain desired 
rank for listing. The server has search unit that conducts category 
or index search of the site listings database based upon selected 
category or keywords provided with search inquiry from user. An 
INDEPENDENT CLAIM is also included for method of web site searching 
and listing. 

USE - For searching web sites on the Internet based on monetary
ranking.

ADVANTAGE - The web site owners can determine for themselves the
rankings that their information or services should receive in
competition with others and not through computation of ranking based
on arbitrary factors or subjective determination by search service.
Also the web site owners are able to readily upgrade or downgrade their
rankings based upon their assessment of market factors on an on-going
basis, using the indexing and searching system. The system can be
readily implemented at manageable cost and readily understood by users
without having to accept a new search orthodoxy or unfamiliar change
of search usage.

DESCRIPTION OF DRAWING (S) - The figure shows user interface used
with denominated value search service.
Image search method for personal computer, involves accessing various
image database based on respective converted search conditions, using 
common keyword 
Abstract (Basic): JP 2000076287 A

NOVELTY - The weighting of characteristic vector classification
of the image which is the search object is performed based on a keyword for
image search. The conversion of search conditions is performed to
access the image data. The image database is accessed based on the
converted search condition. Several such image databases are
searched using respective search conditions and a common keyword.

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are also included for the 
following: 

(a) image search apparatus;

(b) image search program

USE - For personal computer.

ADVANTAGE - Image search is performed in image databases with
different search conditions using a common keyword, therefore the
difference of search conditions in each database management system is
eliminated.

DESCRIPTION OF DRAWING (S) - The figure shows the flow chart of
image search method.
Internet document search apparatus - has several database
management systems to distribute universal resource locator information of
document on internet

Abstract (Basic): JP 11282870 A

NOVELTY - A search robot (10a) collects the data from a document
and registers it on several database management system (DBMS)
models (10h). The DBMS distributes the universal resource locator (URL)
information of the document on the internet to the list of ordered
vocabulary in the document collected by the search robot without
overlapping.

DETAILED DESCRIPTION - An INDEPENDENT CLAIM is also
included for control method of internet document search apparatus.

USE - For searching document in internet.

ADVANTAGE - Since the system is implemented by coordination between
several DBMS, the access frequency is reduced. The URL information is
arranged in order automatically and can be accessed depending on a
content line.

DESCRIPTION OF DRAWING (S) - The figure shows the block
diagram of internet search apparatus.

Search robot (10a)
DBMS model (10h)
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Abstract (Basic) : US 5983220 A 

NOVELTY - A proximity searcher user interface is coupled to 
evaluation engine for displaying reference item from the database 
(2.4). The searcher user interface also displays nearest neighbor 
item for attribute as a function of distance between reference item and 
nearest neighbor item, for at least one attribute of domain model 
(2.10) . 

DETAILED DESCRIPTION - An evaluation engine couples the domain model to 
the database, and provides a user interface (2.16) that allows the user to 
iteratively set criteria for selecting and displaying a set of 
matching items comprising a short list. The evaluation engine allows the 
user to inspect, compare or navigate the items on the short list. A scoring 
interface displays the relative score of each item on the short list. A 
direct manipulator adjusts the relative weight of each 
attribute of an item. The evaluation engine redetermines the relative score of 
each item in the short list according to any change in the relative weighting 
of attributes. 
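
The rescoring behaviour described above can be illustrated with a small Python sketch. It is only an assumption-laden illustration, not the patented evaluation engine: the function name score_items, the normalised attribute values in [0, 1] and the weighted-average formula are all hypothetical choices made for the example.

```python
def score_items(items, weights):
    """Return (name, score) pairs sorted best-first.

    items:   {name: {attribute: value in [0, 1]}}  (assumed normalisation)
    weights: {attribute: relative weight}; at least one weight must be > 0
    """
    total = sum(weights.values())
    scored = []
    for name, attrs in items.items():
        # Weighted average of the attribute values (an assumed scoring rule).
        s = sum(weights[a] * attrs.get(a, 0.0) for a in weights) / total
        scored.append((name, s))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


short_list = {
    "item_a": {"price": 0.9, "quality": 0.4},
    "item_b": {"price": 0.5, "quality": 0.9},
}
# The user first favours price, then changes the relative weighting.
before = score_items(short_list, {"price": 2.0, "quality": 1.0})
after = score_items(short_list, {"price": 1.0, "quality": 2.0})
```

Changing only the weights reorders the short list, which is the effect attributed to the direct manipulator.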

USE - For helping consumers and business users to find items in a 
computer database that most closely match their objective 
requirements and subjective preferences in a network environment. 

ADVANTAGE - Supports analysis and evaluation of similarity of items 
in database with respect to multiple criteria, hence database of 
information rich items can be turned into an interactive buyer's guide. 

DESCRIPTION OF DRAWING (S) - The figure shows software component of 
database evaluation system. 

Database (2.4) 

Domain model (2.10) 

User interface (2.16) 
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Data mining apparatus for extracting correlation rule within relationship 
database - includes procedure file with series of procedure for 
converting relationship database into item database, after which 
correlation rule between items of database is extracted and 

Patent Assignee: MITSUBISHI ELECTRIC CORP (MITQ ) 

Number of Countries: 001 Number of Patents: 001 

Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

JP 11250084 A 19990917 JP 9849739 A 19980302 199949 B 

Priority Applications (No Type Date) : JP 9849739 A 19980302 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
JP 11250084 A 16 G06F-017/30 

Abstract (Basic) : JP 11250084 A 

NOVELTY - Procedure file (312) records procedures for converting a 
relational database into an item database. A mining executing unit 
extracts correlation rules between items of the database and outputs them as 
a sequence rule file which is displayed. A procedure file edit unit 
processes attribute values of the database, in order. DETAILED DESCRIPTION - 
The procedure edit unit performs procedures such as digitization, grouping, 
displacing non-attribute values, deleting attributes, selecting 
records, itemization, amendments, deletion and modification etc. 
A procedure file application setting unit arranges one or more procedure 
files in order, for use by the preprocessing executing section (302). 

USE - For extracting correlation rule within relationship database. 

ADVANTAGE - Since the content of the procedure file is changed and applied 
to the relational database, several preprocessing practices can be easily 
repeated. Since the hierarchical structure obtained can be displayed by 
effecting conversion based on the content of the interval data dictionary, 
the structure of data obtained from the relationship database by applying the 
procedure file can be understood with ease. DESCRIPTION OF DRAWING (S) - 
The figure shows a detailed block diagram of the data mining apparatus. (302) 
Preprocessing executing section; (312) Procedure file. 
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Concurrent control method for multi server database system, B-trees 
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US 5920857 A 8 G06F-017/30 

Abstract (Basic) : US 5920857 A 

NOVELTY - Transactions (T) are split into sub-transactions (Tn) 
at the transaction originator server and are executed using a two phase 
commit protocol. Logs of committed sub-transactions (CL) and 
sub-transactions (WL) ready to commit are maintained and verified at 
each server for each incoming transaction (Tn). 

DETAILED DESCRIPTION - During transaction execution at the server, 
a logical time is incremented at each server machine. A transaction 
T(I,D,V) is accumulated at the client in three sets: an insert set (I), a 
delete set (D) and a verify set (V). The insert and delete sets comprise 
the data items to be inserted and deleted, and the verify set comprises a 
set of descriptions (P) which contain 
information that identifies the data retrieval operations performed by the 
client with respect to a server, the particular server subjected to the 
client data retrieval operations and a logical time stamp at the 
particular server. A transaction (T) is delivered from a client to the 
selected server, which is designated as the transaction originator 
server. An INDEPENDENT CLAIM is also included for a query optimization 
method. 
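
As a rough illustration of validating a verify set against logical time stamps, the following Python sketch models a single server only. It omits the two phase commit and the CL/WL logs, and the class Server, its method names and the validation rule (reject if an item was written after the client read it) are assumptions made for the example rather than details taken from the patent.

```python
class Server:
    """Single-server toy model of optimistic validation with a logical clock."""

    def __init__(self):
        self.clock = 0          # logical time, advanced on every commit
        self.data = {}          # key -> current value
        self.last_write = {}    # key -> logical time of the last committed write

    def read(self, key):
        # Return the value together with the logical time of the read.
        return self.data.get(key), self.clock

    def commit(self, inserts, deletes, verify):
        # verify: list of (key, logical time at which the client read the key)
        for key, read_time in verify:
            if self.last_write.get(key, 0) > read_time:
                return False    # the item changed after it was read: abort
        self.clock += 1
        for key, value in inserts.items():
            self.data[key] = value
            self.last_write[key] = self.clock
        for key in deletes:
            self.data.pop(key, None)
            self.last_write[key] = self.clock
        return True


server = Server()
first = server.commit({"x": 1}, set(), [])
_, seen_at = server.read("x")
second = server.commit({"x": 2}, set(), [])               # overwrites x
stale = server.commit({"y": 1}, set(), [("x", seen_at)])  # read is out of date
fresh = server.commit({"y": 1}, set(), [("x", server.read("x")[1])])
```

Only logical time is compared, so no synchronized physical clock is needed in the sketch.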

USE - For multi server database systems comprising multiple 
clients and multiple servers, and for B-trees. 

ADVANTAGE - The computational load to the server is reduced and a 
fine granularity is implemented which improves the overall server 
performance. The use of synchronized physical clocks is eliminated by 
using logical clocks. 

DESCRIPTION OF DRAWING (S) - The figure shows the work of an 
optimistic concurrency control algorithm with logical time stamps. 
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Automatic text data display procedure of search inquiry related text in 
computer network - involves displaying text related to search inquiry 
automatically, after extracting them from database selected based on 
search inquiry 
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Abstract (Basic) : JP 11102376 A 

NOVELTY - A search inquiry is designated. Then, several 
databases related to the search inquiry are chosen automatically, after 
which at least one of the databases is specified from among the selected 
ones. The documents returned from these databases, in the order of 
relationship to the search inquiry or in the order of ranking, are 
systemized, extracted and displayed automatically to the user. DETAILED 
DESCRIPTION - An INDEPENDENT CLAIM is included for the automatic text 
data display apparatus. 

USE - In computer network. 

ADVANTAGE - Automatically displays text that is extracted from several 
databases and is related to a search inquiry. DESCRIPTION OF 
DRAWING (S) - The figure shows the schematic block diagram which 
explains automatic display of text related to search inquiry. 
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collection selection relative to a set of databases to obtain consistent 
relative- ranking collection selection results each iteration 

Patent Assignee: INFOSEEK CORP (INFO-N) 

Inventor: CHANG W I; KIRSCH ST 

Number of Countries: 081 Number of Patents: 004 



Patent No    Kind  Date      Applicat No   Kind  Date      Week 
WO 9914691   A1    19990325  WO 98US18844  A     19980910  199920 
AU 9892282   A     19990405  AU 9892282    A     19980910  199933 
US 5983216   A     19991109  US 97928294   A     19970912  199954 
US 6018733   A     20000125  US 97928543   A     19970912  200012 



Priority Applications (No Type Date): US 97928543 A 19970912; US 97928294 A 
19970912; US 97928542 A 19970912 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 

WO 9914691 A1 E 46 G06F-017/30 

Designated States (National) : AL AM AT AU AZ BA BB BG BR BY CA CH CN CU 
CZ DE DK EE ES FI GB GE GH GM HU ID IL IS JP KE KG KP KR KZ LC LK LR LS 
LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR 
TT UA UG UZ VN YU ZW 

Designated States (Regional): AT BE CH CY DE DK EA ES FI FR GB GH GM GR 
IE IT KE LS LU MC MW NL OA PT SD SE SZ UG ZW 
AU 9892282 A G06F-017/30 Based on patent WO 9914691 

US 6018733 A G06F-017/30 
US 5983216 A G06F-017/30 

Abstract (Basic) : WO 9914691 A1 

NOVELTY - A collection selection query including a set of set 
search terms is obtained. An inverse collection frequency is 
determined for each search term with respect to each database and the 
set of databases. A document frequency is determined for each search 
term with respect to each database. A ranking value is determined for 
each database based on a sum of the products of the inverse collection 
frequencies for the search terms and the document frequencies for 
respective search terms. A subset of the set of databases is selected 
based on set criteria dependent on the ranking value for each 
database . 

DETAILED DESCRIPTION - The method involves: a) obtaining a 
collection selection query including a set of set search terms, b) 
determining an inverse collection frequency for each search term with 
respect to each database and the set of databases, and determining a 
document frequency for each search term with respect to each 
database, c) determining a ranking value for each database based on a 
sum of the products of the inverse collection frequencies for the 
search terms and the document frequencies for respective search 
terms, d) selecting a subset of the set of databases based on set 
criteria dependent on the ranking value for each database, and e) 
selectively repeating portions of the steps (b) through (d) with 
respect to each search term for each iteration of the method. 
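
Steps (b) through (d) can be sketched in Python as follows. The abstract does not give the exact inverse collection frequency formula, so the logarithmic form used here is an assumption, as are the function name rank_collections, the top_k selection criterion and the sample document frequencies.

```python
import math


def rank_collections(df, terms, top_k):
    """df: {database: {term: document frequency}}.

    Returns (selected databases, {database: ranking value}).
    """
    n = len(df)
    icf = {}
    for t in terms:
        # Number of databases that contain the term at least once.
        containing = sum(1 for counts in df.values() if counts.get(t, 0) > 0)
        # Assumed ICF form: rarer across collections means a higher weight.
        icf[t] = math.log(1 + n / containing) if containing else 0.0
    # Ranking value: sum over terms of ICF(term) * DF(term, database).
    ranking = {
        db: sum(icf[t] * counts.get(t, 0) for t in terms)
        for db, counts in df.items()
    }
    ordered = sorted(ranking, key=ranking.get, reverse=True)
    return ordered[:top_k], ranking


df = {
    "news": {"merger": 40, "bank": 100},
    "sports": {"bank": 5},
    "finance": {"merger": 60, "bank": 300},
}
selected, ranking = rank_collections(df, ["merger", "bank"], top_k=2)
```

Because the ranking value depends only on stored frequencies, repeating the call with the same terms yields the same relative ordering each time.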

USE - The method is used to permit iterative performance of 
collection selection relative to a set of databases, where each 
database includes several documents, to obtain consistent relative- 
ranking collection selection results each iteration. 

ADVANTAGE - Improves selection of most relevant collections for 
searching based on an ad hoc query . 

DESCRIPTION OF DRAWING (S) - The drawing shows a flow diagram 
illustrating the operation in supporting a meta-index database 
construction and user search . 
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Search method of multiple databases connected to network and 
developed independently - involves searching database which stores 
selected table, by producing data item and search conditions consisting 
of conditional expression of value 

Patent Assignee: NIPPON TELEGRAPH & TELEPHONE CORP (NITE ) 

Number of Countries: 001 Number of Patents: 001 

Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 
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Priority Applications (No Type Date) : JP 97208871 A 19970804 
Patent Details: 
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Abstract (Basic) : JP 11053383 A 

NOVELTY - The detected data item, the data item containing the 
value which coincides with the search conditions within the candidate 
of a table, and a table are chosen. The database which stores the 
selected table is searched by producing the data item and the search 
conditions consisting of a conditional expression of the value. DETAILED 
DESCRIPTION - Using a dictionary consisting of the correspondence 
relation of the expression conversion of the data item of search 
conditions and the data item of a database, the conditions of a table 
and data item, and the range conditions of data item of a database, the 
expression of the data item of the designated search conditions is 
converted to the expression of the data item of a database. The 
candidate of the data item of the database for search and a table is 
detected from the data item to which the conversion of search 
conditions is performed. An INDEPENDENT CLAIM is also included for a 
search program recording medium. 

USE - For searching databases connected to network and developed 
independently. 

ADVANTAGE - Avoids wasteful search of databases without a value 
corresponding to the concrete value of the search conditions, since the data 
item and table relevant to the data item of the search conditions can be 
searched from multiple databases. Raises the possibility of search of a data 
item, since the data item name of the designated search conditions can be made 
to correspond to the data item name in a related database. DESCRIPTION 
OF DRAWING (S) - The figure shows the block diagram of a search 
apparatus. 
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SQL queries optimization method in relation database management 

system 

Patent Assignee: NCR CORP (NATC ) 
Inventor: KRAUS T B; RAMESH B; WALTER T A 
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Abstract (Basic) : US 5884299 A 

NOVELTY - The query is examined to determine if it includes one or 
more aggregation operations on rows of a table in a relational database. 
Several local aggregate result rows are created by aggregating rows 
of the table by the aggregation operation. The local aggregate result rows 
are redistributed to several global aggregation operations to create 
several global aggregate result rows. 
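
The local-then-global aggregation described above can be sketched in Python. This is a single-process illustration under stated assumptions: the function names, the hash-based redistribution rule and the (sum, count) partial result are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


def local_aggregate(rows):
    """Aggregate one processor's (key, value) rows into key -> (sum, count)."""
    partial = defaultdict(lambda: [0, 0])
    for key, value in rows:
        partial[key][0] += value
        partial[key][1] += 1
    return {k: tuple(v) for k, v in partial.items()}


def redistribute(partials, n_global):
    """Send each local result row to a global aggregator chosen by key hash."""
    buckets = [defaultdict(list) for _ in range(n_global)]
    for partial in partials:
        for key, sum_count in partial.items():
            buckets[hash(key) % n_global][key].append(sum_count)
    return buckets


def global_aggregate(bucket):
    """Combine the local (sum, count) rows that arrived for each key."""
    return {
        key: (sum(s for s, _ in rows), sum(c for _, c in rows))
        for key, rows in bucket.items()
    }


partitions = [
    [("east", 10), ("west", 5), ("east", 20)],   # rows held by processor 0
    [("west", 7), ("east", 1)],                  # rows held by processor 1
]
partials = [local_aggregate(p) for p in partitions]
result = {}
for bucket in redistribute(partials, n_global=2):
    result.update(global_aggregate(bucket))
```

Keeping (sum, count) pairs rather than averages lets functions such as AVG be derived correctly after the global step.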

DETAILED DESCRIPTION - An INDEPENDENT CLAIM is also included for 
query optimizing apparatus. 

USE - For optimizing SQL queries that use aggregate or grouping 
functions in a relational database management system, in an MPP computer 
system. 

ADVANTAGE - The queries are split into sub-queries by a single 
processor in order to minimize the overhead associated with the 
processing of the entire query. The sub-queries are performed 
simultaneously on a single processor using a multitasking operating 
environment. 

DESCRIPTION OF DRAWING (S) - The figure represents flow chart for 
the execution of the global aggregation in SQL queries optimization 
method . 
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Keyword processing method for electronic documentation - involves 
selecting extracted keywords existing in less number of documents , as 
search keywords, which are then coupled by logical expression, for 
searching document in database 

Patent Assignee: NTT DATA TSUSHIN KK (NITE ) 

Number of Countries: 001 Number of Patents: 001 

Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

JP 10320403 A 19981204 JP 97124562 A 19970514 199908 B 

Priority Applications (No Type Date) : JP 97124562 A 19970514 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
JP 10320403 A 8 G06F-017/30 

Abstract (Basic) : JP 10320403 A 

The method involves performing morphological analysis of designated 
document groups stored in a database, for extracting several 
keywords. The number of documents in which each extracted keyword 
exists is detected with reference to a keyword document frequency 
dictionary (12). 

The keywords which exist in fewer documents are selected 
as search keywords. The selected search keywords are coupled by a 
logical expression, for searching the document in the database. 
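
A minimal Python sketch of this keyword selection follows. Whitespace tokenisation stands in for morphological analysis, and the function name build_query, the max_keywords cut-off, the AND operator and the sample frequencies are all assumptions for illustration.

```python
def build_query(documents, doc_freq, max_keywords=2, operator="AND"):
    """Pick the keywords with the lowest document frequency and join them.

    documents: list of strings (the designated document group)
    doc_freq:  {keyword: number of documents containing it}, standing in for
               the keyword document frequency dictionary
    """
    # Whitespace splitting is a stand-in for real morphological analysis.
    keywords = {word for doc in documents for word in doc.lower().split()}
    known = [word for word in keywords if word in doc_freq]
    # Rarest first; ties are broken alphabetically for a stable result.
    rare = sorted(known, key=lambda w: (doc_freq[w], w))[:max_keywords]
    return f" {operator} ".join(rare)


doc_freq = {"the": 9000, "engine": 120, "turbine": 15, "blade": 30}
query = build_query(["The turbine blade", "the engine blade"], doc_freq)
```

Rare keywords discriminate better between documents, which is why the common word "the" never reaches the query.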

ADVANTAGE - Improves document searching efficiency, greatly. 
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SQL database access method - involves associating and searching data of 
several databases, when relation of data item in table is detected to 
be effective 
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Abstract (Basic) : JP 10301821 A 

The method involves registering the type and range of the value of 
a data item into a dictionary. Then, the relation of the data item in 
a table is detected by using the dictionary. When the detected 
relation is found to be effective, the data of several databases 
are associated and searched by using the detected relation. 

ADVANTAGE - Searches data between several databases , simply. 
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Similar vector data search apparatus - calculates weight factor based 
on vector data designated by correct answer selector when displayed 
searched similar vector data is judged to be incorrect 

Patent Assignee: MITSUBISHI ELECTRIC CORP (MITQ ) 
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Patent No Kind Date Applicat No Kind Date Week 

JP 10228475 A 19980825 JP 9729400 A 19970213 199844 B 

Priority Applications (No Type Date) : JP 9729400 A 19970213 
Patent Details: 
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JP 10228475 A ■ 16 G06F-017/30 

Abstract (Basic) : JP 10228475 A 

The apparatus includes an object data addressing unit that 
designates vector data for similar vector data searching. A vector 
database stores several vector data. A similar vector data 
searching data unit (3) searches similar vector data within several 
vector data based on the weight factor. 

A display unit displays the searched similar vector data. Based 
on the operation of the user it is judged whether the displayed 
searched similar vector data are correct data or incorrect data. A 
three vector weight reabsorption unit (8) calculates the weight 
factor based on the vector data designated by a correct answer selector 
(7) when the displayed searched similar vector data is judged to be 
incorrect . 

ADVANTAGE - Prevents deterioration in searching accuracy. Does 
not revise weight factor when displayed searched similar vector is 
incorrect. Facilitates setting up a new weight vector. 
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Database management system e.g. for computer - has search unit that 
searches , according to set search ranking on database 
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Abstract (Basic) : JP 10143516 A 

The system has a setting unit which sets the search ranking for 
several classification items of different attributes. A search unit 
searches a database (13) according to set up ranking . The database 
consisting of several datagroups comprising attributes with one or 
more classification items, is searched based on set ranking . 

ADVANTAGE - Reduces search time and burden on computer. 
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Machine assisted translation system - analyses outputs of search and 
substitution units and produces that output with higher degree of 
correspondence with input sentence as final translation result 

Patent Assignee: SHARP KK (SHAF ) 
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Abstract (Basic) : JP 10097538 A 

The system includes a bilingual database which stores various 
sentences in one language and their equivalent in the other language. 
An implicative dictionary which stores the implication of various 
notations used in one language, and a variable implication dictionary 
which stores other possible implications of the notation are also 
included. A sentence to be translated is input through an input unit. 

The bilingual database and the two dictionaries are searched 
by a search unit for translating the input sentence. A substitution 
unit provides an alternate translation result as a substitute. The 
outputs of the search and substitution units are analysed and their 
degree of correspondence with the input sentence is obtained. The output 
that has the higher degree of correspondence is produced as the final 
output. 

ADVANTAGE - Produces translated sentences with almost correct 
semantics reliably. 
Dwg. 3/21 

Title Terms: MACHINE; ASSIST; TRANSLATION; SYSTEM; ANALYSE; OUTPUT; SEARCH 
; SUBSTITUTE; UNIT; PRODUCE; OUTPUT; HIGH; DEGREE; CORRESPOND; INPUT; 
SENTENCE; FINAL; TRANSLATION; RESULT 

Derwent Class: T01 

International Patent Class (Main) : G06F-017/28 
File Segment: EPI 



26/5/54 (Item 44 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2003 Thomson Derwent. All rts. reserv. 

011809548 **Image available** 

WPI Acc No: 1998-226458/199820 

XRPX Acc No: N98-179942 

Data search parallel database search method for RDBMS - involves 
extracting sub data from database operation server in response to 
enquiry, based on positional information of data, dictionary 
information relevant to sub data and identifier of registered sub data 
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Abstract (Basic) : JP 10069488 A 

The method involves using multiple database operation servers 
(13) connected to a network. A front end server (12) analyses the 
enquiry from a database. The database operation server performs data 
search out operation, and outputs the positional information of data 
consisting of sub data to the front end server, as search result. 

The front end server performs processing and control of search 
result. Each sub data is extracted from the database operation server 
in response to a next enquiry, using the positional information, 
dictionary information relevant to the position of sub data and 
identifier of registered sub data. 

ADVANTAGE - Enables reduction of enquiry time and communication 
time by forwarding only data required for next process. 
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Multiple aggregation level database query result generation method 
- involves receiving query request specifying ranked aggregation 
levels specifying grouping fields and producing result table of all 
source table records specified by aggregation level and selecting 
specific value 
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Abstract (Basic) : US 5713020 A 

The query generation method involves receiving a query request 
specifying numerous ranked aggregation levels specifying grouping 
fields. For a superior aggregation level a result table is produced 
with an aggregation level from all records of a source table specified 
by the grouping field specified by the aggregation level. A distinct 
value is selected. For each aggregation level inferior to at least one 
other level a result table is produced with an aggregation level from 
all records of a source table having a selected value of a grouping 
field specified by the superior aggregation level and inferior to the 
largest number of other aggregation levels. A distinct value of the 
grouping field is selected from this table. 

A different distinct value of the grouping field is selected 
specified by one of the aggregation levels. For each aggregation level 
inferior to the aggregation level for which a different distinct value 
of the grouping field is selected the existing result table for the 
aggregation level with result table is replaced from all records of the 
source table with values of grouping field specified by aggregation 
level superior to aggregation level and inferior to largest number of 
other aggregation levels. A distinct value of the grouping field 
specified by the aggregation level occurring in a record of the source 
table having the selected value of the grouping field specified by the 
aggregation level superior to the aggregation level and inferior to the 
largest number of other aggregation levels. 

USE - Generates and displays multiple-level and cross-tab 
aggregation query results. Provides application programming interface 
for multi-level and cross-tab queries . 

Dwg. 17/18 

Title Terms: MULTIPLE; AGGREGATE; LEVEL; DATABASE; QUERY ; RESULT; 
GENERATE; METHOD; RECEIVE; QUERY ; REQUEST ; SPECIFIED; RANK ; 
AGGREGATE; LEVEL; SPECIFIED; GROUP; FIELD; PRODUCE; RESULT; TABLE; SOURCE 
; TABLE; RECORD; SPECIFIED; AGGREGATE; LEVEL; SELECT; SPECIFIC; VALUE 

Derwent Class: T01 

International Patent Class (Main) : G06F-017/30 
File Segment: EPI 



26/5/56 (Item 46 from file: 350) 

DIALOG (R) File 350: Derwent WPIX 

(c) 2003 Thomson Derwent. All rts. reserv. 

011675073 **Image available** 

WPI Acc No: 1998-091982/199809 

XRPX Acc No: N98-073215 

Translator with dictionary searching function - has temporary data 
memory that stores data output by both translation processing unit and 
dictionary-searching processor, and output display unit that displays 
output data 
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Abstract (Basic) : JP 9319751 A 

The translator has an input designating unit (1) that inputs a 
sentence described in a particular language. A translation dictionary 
(3) stores the data used for translation, e.g. a word dictionary, a grammar 
dictionary and syntax rules. The input sentence is translated into another 
language by a translation processing unit (2). 

During the translation process, a dictionary-searching 
processor (6) searches a word dictionary database (7) for data 
relating to the dictionary data of the input sentence. The output 
data from the translation processing unit and the dictionary-searching 
processor are stored in a temporary data memory (8) and displayed on 
an output display unit (5). 

ADVANTAGE - Simplifies interrogative dissolution of translation 
result via machine translation. Enables desired data to be obtained 
effectively from several dictionary databases since only required 
grammar analysis result is displayed, hence minimising unsuitable 
translation results . 
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Document search and retrieve method using several database over 
computer network - involves applying search query from client to each 
server associated with each database, at each server list of relevant 
documents is determined 
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Abstract (Basic) : US 5659732 A 

The method involves applying a search query from the client to 
each server associated with each database; at each server, a list of 
relevant documents is determined. Statistics about each database are 
obtained at the client from each server. Information about the 
relevant documents resulting from application of the query to the 
associated database is obtained at the client from each server. A 
relevance score for each document is computed at the client using the 
statistics and the information, and the computed relevance score is 
used in determining how the relevant documents from all of the 
databases should be ordered in a list of merged relevant documents. 
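
The client-side merge can be sketched in Python as below. The abstract does not state the scoring formula, so a tf-idf score with a global idf pooled from the per-database statistics is assumed here; merge_results and the field names n_docs and df are hypothetical.

```python
import math


def merge_results(db_stats, db_hits, terms):
    """Merge per-database hits into one list ranked by a global score.

    db_stats: {db: {"n_docs": int, "df": {term: document frequency}}}
    db_hits:  {db: {doc_id: {term: term frequency}}}
    """
    total_docs = sum(stats["n_docs"] for stats in db_stats.values())
    idf = {}
    for t in terms:
        # Pool document frequencies so every document is scored on one scale.
        df = sum(stats["df"].get(t, 0) for stats in db_stats.values())
        idf[t] = math.log((total_docs + 1) / (df + 1))
    merged = []
    for db, hits in db_hits.items():
        for doc_id, tf in hits.items():
            score = sum(tf.get(t, 0) * idf[t] for t in terms)
            merged.append((score, db, doc_id))
    # Highest score first; database and document id break ties.
    merged.sort(key=lambda row: (-row[0], row[1], row[2]))
    return merged


stats = {
    "A": {"n_docs": 100, "df": {"solar": 10, "panel": 50}},
    "B": {"n_docs": 900, "df": {"solar": 5, "panel": 400}},
}
hits = {
    "A": {"a1": {"solar": 1, "panel": 3}},
    "B": {"b7": {"solar": 2, "panel": 1}},
}
merged = merge_results(stats, hits, ["solar", "panel"])
```

Scoring with pooled statistics avoids comparing scores that each server computed against its own collection only.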

ADVANTAGE - Capable of searching multiple collections in a single 
pass with ranking of documents. 
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FIM system for integrating data from multiple interconnected local 
databases to provide users with access to virtual database - has user 
interface for generating global query to search virtual database, DIM 
that decomposes global query into local queries , and number of LIMs 
that execute local queries to search enumerated databases 

Patent Assignee: HUGHES AIRCRAFT CO (HUGA ) 

Inventor: NOBLE W B; PATEL B K; WANG J K 

Number of Countries: 001 Number of Patents: 001 

Patent Family: 
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Abstract (Basic) : US 5634053 A 

The database controller comprises the user interface for generating 
a global query to search the virtual database, which has an 
associated global format, the global query including at least one 
data field from a set of commonly used data fields whose values are 
represented in an input format. A smart data dictionary (SDD) 
contains configuration data for each of the local databases, including 
respective local formats for each of the commonly used data fields. A 
selector selects the input format for generating the global query 
from one of the global and local formats. An input translator converts 
the value of the data field in the global query into local values in 
the respective local formats. 

The data information manager (DIM) generates local queries 
including the local values for the data field in response to the global 
query and in accordance with the respective configuration data. A 
number of local information managers (LIMs) execute the local queries 
to search for and retrieve from the respective local databases data 
that is associated with the local values of the data field, the LIMs 
passing the data back to the DIM, where it is combined to present 
the requesting user with an integrated response. An output translator 
converts the data passed back from the LIMs from their respective local 
formats into the input format so that the data can be combined to 
present the user with the integrated response. 
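
The flow from global query to integrated response can be sketched in Python for a single date field. Everything concrete here is an assumption for illustration: the SDD contents, the database names, the sample rows and the use of the global format as the input format. In-memory lists stand in for the local databases and LIMs.

```python
from datetime import datetime

# Hypothetical smart data dictionary: local field name and date format.
SDD = {
    "db_east": {"field": "ship_dt", "fmt": "%m/%d/%Y"},
    "db_west": {"field": "shipped", "fmt": "%Y%m%d"},
}
GLOBAL_FMT = "%Y-%m-%d"   # assumed input format chosen by the selector

LOCAL_DATA = {
    "db_east": [{"ship_dt": "03/01/1996", "part": "bolt"}],
    "db_west": [
        {"shipped": "19960301", "part": "nut"},
        {"shipped": "19960302", "part": "gear"},
    ],
}


def to_local(value, db):
    """Input translator: input format -> the database's local format."""
    return datetime.strptime(value, GLOBAL_FMT).strftime(SDD[db]["fmt"])


def to_global(value, db):
    """Output translator: local format -> input format."""
    return datetime.strptime(value, SDD[db]["fmt"]).strftime(GLOBAL_FMT)


def global_query(ship_date):
    """Decompose into local queries, run them, and combine the answers."""
    merged = []
    for db, cfg in SDD.items():
        local_value = to_local(ship_date, db)
        for row in LOCAL_DATA[db]:          # stand-in for a LIM's local query
            if row[cfg["field"]] == local_value:
                merged.append({
                    "source": db,
                    "ship_date": to_global(row[cfg["field"]], db),
                    "part": row["part"],
                })
    return merged


rows = global_query("1996-03-01")
```

The user writes one date in one format, and every returned row carries the date in that same format regardless of its source.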

ADVANTAGE - Efficiently and truly integrates data from a number of 
interconnected and heterogeneous local databases to provide users with 
access to a virtual database. Better user friendliness. Increases 
completeness of search. 
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Identifying textual documents and multimedia files corresponding to 
search topic - accepting query and returning single search results 
list having text and multimedia information 
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Abstract (Basic) : WO 9710537 A 

The method for identifying textual documents and multimedia files 
involves storing a number of document and multimedia records each of 
which represent a document or multimedia file. The document records 
have associated text information fields, each of which represents text 
from one of the textual documents, and the multimedia records have 
multimedia information fields representing only digital video or audio 
information and associated text fields, each representing text 
associated with one of the multimedia information fields. 

A single search query corresponding to the search topic is 
received, preferably in a natural language format, and an index database is 
searched in accordance with the single search query to 
simultaneously identify document records and multimedia records related 
to the single search query. A search result list having entries 
representing both textual documents and multimedia files related to the 
single search query is generated in accordance with the document 
records and the multimedia records identified by the index database 
search. Text or digital video or audio information corresponding to 
the search topic is retrieved by selecting entries from the search 
result list. 

USE - Automated multi-user system for identifying and retrieving 
text and multi-media files from various publisher sources. 

ADVANTAGE - Enables searching and retrieval of library or 
database to identify text documents and multimedia files relevant to 
query. 
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Database search system e.g. for patent official report, scientific paper, 
newspaper report - has index search part and whole sentence search part 
with which database is searched according to input search type 

Patent Assignee: NIPPON STEEL CORP (YAWA ) 
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Patent No Kind Date Applicat No Kind Date Week 
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Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
JP 8272806 A 8 G06F-017/30 

Abstract (Basic) : JP 8272806 A 

The system has a database storing part (11) in which the database 
is searched. A search type is given as the input to an input part (15). 
A division part (21) divides the search type into single term types. 
Two search parts, an index search part (13) and a whole sentence search 
part (14), are also provided. 

An assignment part (32) assigns the single term types to the two 
search parts respectively. An arithmetic part (33) carries out a logical 
operation on the results obtained from the index search and the whole 
sentence search parts, based on the search type, and gives an output to 
a display part (16). 
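
The division, assignment and logical combination can be sketched in Python. The abstract does not say how terms are assigned to the two search parts, so routing indexed terms to the index search and all other terms to a full scan is an assumption, as are the sample documents, the index contents and the flat single-operator expression syntax.

```python
DOCS = {
    1: "steel plate rolling method",
    2: "rolling mill control for steel strip",
    3: "newspaper report on alloy prices",
}
INDEX = {"steel": {1, 2}, "alloy": {3}}   # only some terms are indexed


def index_search(term):
    """Index search part: look the term up in the prebuilt index."""
    return set(INDEX[term])


def full_text_search(term):
    """Whole sentence search part: scan every document text."""
    return {doc_id for doc_id, text in DOCS.items() if term in text}


def search(expression):
    """Evaluate a flat expression such as 'steel AND rolling'.

    Only one kind of operator (AND or OR) per expression is supported.
    """
    tokens = expression.split()
    terms, operators = tokens[0::2], set(tokens[1::2])
    op = operators.pop() if operators else "AND"
    # Assumed assignment rule: indexed terms go to the index search part.
    results = [
        index_search(t) if t in INDEX else full_text_search(t) for t in terms
    ]
    if op == "AND":
        return set.intersection(*results)
    return set.union(*results)


both = search("steel AND rolling")
either = search("alloy OR strip")
```

The set operations play the role of the arithmetic part, combining results that came from two different search techniques.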



ADVANTAGE - Searches database efficiently , using multiple 
search techniques . 
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Parallel text search system for text search processing - has feedback 
unit which repeats searching again using parallel calculation until 
question sentence vector satisfies user conformity for search result 
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Abstract (Basic) : JP 8263517 A 

The system has a weight factor generator (1) which forms a weight vector 
by calculating the weight degree of importance for each term. A memory 
(11) stores the x piece number to distribute the weight vector of each 
text formed by the weight factor generator. 

A processor calculates and outputs the score in which the 
similarity between each text weight vector and the weight vector 
corresponding to a question sentence is shown. A search processor (2) 
outputs the search result and rearranges the scores in a descending order 
based on the output score. A feedback unit repeats searching using parallel 
calculation depending on the question sentence vector updated according 
to the conformity of the user. 
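
The parallel scoring and descending sort can be sketched in Python. Cosine similarity and a thread pool are assumptions chosen for the example, since the abstract names neither the similarity measure nor the hardware; the feedback step that updates the question vector is left out.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def cosine(u, v):
    """Cosine similarity between two sparse weight vectors (dicts)."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def score_partition(args):
    """Score every text vector in one partition against the question."""
    query, partition = args
    return [(cosine(query, vec), name) for name, vec in partition]


def parallel_search(query, partitions):
    """Score the partitions concurrently, then sort best-first."""
    with ThreadPoolExecutor() as pool:
        chunks = list(pool.map(score_partition,
                               [(query, p) for p in partitions]))
    return sorted((hit for chunk in chunks for hit in chunk), reverse=True)


partitions = [
    [("t1", {"solar": 1.0, "panel": 1.0})],
    [("t2", {"wind": 1.0}), ("t3", {"solar": 1.0})],
]
hits = parallel_search({"solar": 1.0}, partitions)
```

A feedback loop would call parallel_search again with an updated question vector until the user accepts the result.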

ADVANTAGE - Enables processing several text databases since 
speedy text search is performed. 
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Providing extensible query architecture for information retrieval system 
- includes search application that has variety of code module classes, 
each implementing specific type of query model on data types in database 
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Abstract (Basic) : WO 9618159 A 

The system has an extensible query architecture which allows an 
applications programmer to integrate new query models into the system 
as desired. The architecture is based on an abstract base class of 
query nodes, or code objects that retrieve records from the database. 
Specific sub-classes are derived from the base class. Each query node 
class includes a search function that iteratively searches the 
database for matching records. Query node objects are instantiated by 
associated node creator class objects. 

A parser is used to parse a search query into its components, 
including nested search queries used to combine various query models. 
The parser determines the particular search operator keywords and the 
node creator object. The node creator objects return pointers to the 
created query nodes. 

ADVANTAGE - Allows parser to assemble complex hierarchical query 
nodes that combine multiple query models. 
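
The architecture described in this abstract (an abstract base class of query nodes, derived sub-classes, node creators looked up by operator keyword, and a parser for nested queries) can be sketched in Python. This is a simplified illustration, not the patented design: the parenthesised prefix query syntax, the class names, and the `CREATORS` registry are all assumptions, and the node classes themselves stand in for the separate node creator objects.

```python
from abc import ABC, abstractmethod

class QueryNode(ABC):
    """Abstract base class: every query node can search a list of records."""

    @abstractmethod
    def search(self, records):
        """Yield the positions of matching records one at a time."""

class TermNode(QueryNode):
    """Matches records that contain a given word."""

    def __init__(self, args):
        self.term = args[0]

    def search(self, records):
        for position, record in enumerate(records):
            if self.term in record.split():
                yield position

class AndNode(QueryNode):
    """Matches records found by every child node."""

    def __init__(self, args):
        self.children = args

    def search(self, records):
        hits = [set(child.search(records)) for child in self.children]
        yield from sorted(set.intersection(*hits))

class OrNode(QueryNode):
    """Matches records found by at least one child node."""

    def __init__(self, args):
        self.children = args

    def search(self, records):
        hits = [set(child.search(records)) for child in self.children]
        yield from sorted(set.union(*hits))

# Hypothetical registry: operator keyword -> callable that creates a node.
CREATORS = {"term": TermNode, "and": AndNode, "or": OrNode}

def parse(tokens):
    """Recursively turn a token list into a tree of query nodes."""
    token = tokens.pop(0)
    if token != "(":
        return token  # a bare word used as an argument
    keyword = tokens.pop(0)
    args = []
    while tokens[0] != ")":
        args.append(parse(tokens))
    tokens.pop(0)  # drop the closing parenthesis
    return CREATORS[keyword](args)

def run_query(query, records):
    """Tokenise, parse, and execute a nested query."""
    tokens = query.replace("(", " ( ").replace(")", " ) ").split()
    return list(parse(tokens).search(records))
```

Adding a new query model in this sketch means deriving another `QueryNode` sub-class and registering it in `CREATORS` under a new keyword; the parser itself does not change.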
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Goods and services computer assisted brokering system - uses database 
with buyer and seller interfaces containing multimedia information 
describing respective goods and services 
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Abstract (Basic) : WO 9524687 A 

The computer implemented system for brokering transactions between 
sellers and a buyer of goods or services has a database containing 
information, including multimedia information, descriptive of 
respective goods and services. A seller interface interactively enables 
the seller to enter the descriptive information, including the 
multimedia information, into the database. 

A buyer interface interactively uses a knowledge-based protocol, 
enabling the buyer to select and review the descriptive information 
from the database. The buyer interface makes perceptible the multimedia 
information in response to an interactive buying request.

USE/ADVANTAGE - Allows information to be submitted to buyer in 
number of forms. Records all transactions automatically. 
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RUS 

S4   2648400   STRENGTH OR WEIGHT? OR SIGNIFICANCE OR INFLUENCE OR IMPORTANCE OR RANK?
S5   1372381   DATABASE? OR DATA()BASE?
S6   3358347   PARTITION? OR PARSE OR PARSING OR SPLIT? OR DIVIDE? OR SECTION? OR SEGMENT? OR SEPARATE? (5N) S1
S7     32220   (OPTIMIZ? OR PERFECT? OR FUNCTION? OR EFFECTIVE? OR EFFICIENT?) (2N) S1
S8      2820   S3 (S) S4
S9      1395   S6 (S) S2
S10        3   S8 (S) S9
S11       15   S9 (S) S7
S12       29   S9 (S) S3
S13       41   S9 (S) S4
S14      413   S2 (S) S3
S15       19   S14 (S) S8
S16       24   S8 (S) S7
S17     2820   S8 (S) S4
S18     7216   S1 (S) S2
S19        7   S18 (S) S8
S20        7   S1 (S) S2 (S) S3 (S) S4
S21        7   S20 (S) S4
S22        3   S12 (S) S13
S23       57   S10 OR S11 OR S15 OR S16 OR S19 OR S20 OR S21
S24       49   S23 NOT PY>2001
S25       44   S24 NOT PD>20010228
S26       42   RD (unique items)



File 15:ABI/Inform(R) 1971-2003/Dec 12 

(c) 2003 ProQuest Info&Learning
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File 647: CMP Computer Fulltext 1988-2003/Dec W1

(c) 2003 CMP Media, LLC 
File 275: Gale Group Computer DB(TM) 1983-2003/Dec 11 

(c) 2003 The Gale Group 
File 674:Computer News Fulltext 1989-2003/Dec W1
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File 696: DIALOG Telecom. Newsletters 1995-2003/Dec 11 

(c) 2003 The Dialog Corp. 
File 624 : McGraw-Hill Publications 1985-2003/Dec 11 

(c) 2003 McGraw-Hill Co. Inc 
File 636:Gale Group Newsletter DB(TM) 1987-2003/Dec 11 

(c) 2003 The Gale Group 
File 813: PR Newswire 1987-1999/Apr 30 

(c) 1999 PR Newswire Association Inc 
File 613: PR Newswire 1999-2003/Dec 12 

(c) 2003 PR Newswire Association Inc 
File 16:Gale Group PROMT (R) 1990-2003/Dec 11 

(c) 2003 The Gale Group 
File 160:Gale Group PROMT (R) 1972-1989 

(c) 1999 The Gale Group 
File 553:Wilson Bus. Abs. FullText 1982-2003/Oct

(c) 2003 The HW Wilson Co 



26/3,K/1 (Item 1 from file: 15)

DIALOG(R) File 15: ABI/Inform(R)

(c) 2003 ProQuest Info&Learning. All rts. reserv.
02035945 55373353

User interface design for speech-based retrieval

Oard, Douglas W 

American Society for Information Science. Bulletin of the American Society 
for Information Science v26n5 PP: 20-22 Jun/Jul 2000 
ISSN: 0095-4403 JRNL CODE: BAS 
WORD COUNT: 2202 

...TEXT: the program title or the source (e.g., broadcast network) are
shown. If we wish to support effective natural language searching, we
will probably need to provide the user with a far richer view of the search
results...

has explored the use of named entity extraction and automatic
classification to associate proper names and controlled vocabulary
keywords with a speech recognition transcript. At the University of
Maryland we are exploring the utility of...



26/3,K/2 (Item 2 from file: 15)

DIALOG(R) File 15: ABI/Inform(R)

(c) 2003 ProQuest Info&Learning. All rts. reserv.

01485274 01-36262
Compression theory

McGillis, Peggy; Nichols, Mina; Terry, Britt
Computer Technology Review PP: 60-61+ Summer 1997
ISSN: 0278-9647 JRNL CODE: CTN
WORD COUNT: 2498

...TEXT: for the symbol. The longer the match, the better the compression
ratio.

An advantage to using a dictionary-based method is that dictionary
entries may be of various lengths. For instance, an incrementing pattern of
00h to FFh may require only one entry into the dictionary. Patterns
consisting of continuous repeating data, such as all FFh or all 00h, will
compress very efficiently assuming the maximum dictionary word length is
sufficiently large to describe the repeating portion. The importance of
this technology in today's backup devices is huge. The data stored on disk
drives is typically very redundant in cases where the disk or a relational
database tablespace is not full. Also, many tablespaces of database
files contain repeating text which can easily be included in the
dictionary.
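
The claim that continuously repeating data compresses very efficiently under a dictionary-based method can be illustrated with a short sketch. The excerpt does not name a specific algorithm, so textbook LZW is assumed here purely as a representative dictionary coder; `lzw_compress` is a hypothetical name, and the sketch places no limit on dictionary word length.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Encode bytes as a list of dictionary codes using textbook LZW."""
    # Start with one dictionary entry per possible byte value.
    dictionary = {bytes([value]): value for value in range(256)}
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in dictionary:
            # Keep extending the match while it is still in the dictionary.
            current = candidate
        else:
            # Emit the longest known match and learn the longer pattern.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(dictionary[current])
    return codes
```

On a run of identical bytes each new dictionary entry is one byte longer than the last, so the number of emitted codes grows far more slowly than the input; `lzw_compress(b"\xff" * 64)` yields many fewer codes than the 64 input bytes.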

This method of data compression focuses primarily on the encoding 
dictionary. Simple coding methods generally focus on... 

26/3,K/3 (Item 3 from file: 15)

DIALOG(R) File 15: ABI/Inform(R)

(c) 2003 ProQuest Info&Learning. All rts. reserv.
01280818 99-30214

A new patent search tool for the Internet: QPAT-US

Lambert, Nancy

Database v19n4 PP: 56-61 Aug/Sep 1996
ISSN: 0162-4105 JRNL CODE: DTB 
WORD COUNT: 3247 

...TEXT: every one of the top-ranked included one of the misspelled words 
in its text. 

The second vocabulary aid produces a list of "statistically related" 
terms from which to choose. The system looks at the... 



...search produced and generates a list of other terms in the documents by
a sort of relevance ranking: terms that occur most frequently in these
documents compared to their frequency in the whole database. From...
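
The "statistically related" ranking described in this excerpt, terms that are frequent in the retrieved documents relative to the whole database, can be sketched as a simple frequency ratio. The actual formula used by QPAT-US is not given in the excerpt; the ratio below and the name `related_terms` are assumptions for illustration only, and the sketch assumes the result documents are a subset of the database.

```python
from collections import Counter

def related_terms(result_docs, all_docs, top=3):
    """Rank terms by relative frequency in the results versus the whole database."""
    result_counts = Counter(word for doc in result_docs for word in doc.split())
    db_counts = Counter(word for doc in all_docs for word in doc.split())
    result_total = sum(result_counts.values())
    db_total = sum(db_counts.values())

    def ratio(term):
        # Share of the term in the result set divided by its share in the database.
        return (result_counts[term] / result_total) / (db_counts[term] / db_total)

    # Highest ratio first; ties broken alphabetically for a stable order.
    return sorted(result_counts, key=lambda term: (-ratio(term), term))[:top]
```

Terms that appear throughout the database score near 1 and fall to the bottom of the list, while terms concentrated in the result set rise to the top.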

catalytic," and "catalyzed," and the resulting search set was
significantly enlarged (183,330 patents), suggesting that the search term 
stemming function did not include all these terms. The "statistically 
related" function also suggests nonalphabetically-related terms of interest 



26/3,K/4 (Item 4 from file: 15)

DIALOG(R) File 15: ABI/Inform(R)

(c) 2003 ProQuest Info&Learning. All rts. reserv.

01231692 98-81087
The engines that can

Joss, Molly W; Wszola, Stanley

CD-ROM Professional v9n6 PP: 30-38+ Jun 1996
ISSN: 1049-0833 JRNL CODE: LDP
WORD COUNT: 5238 

...TEXT: a search engine equipped to manage databases of multi-terabyte
size. The engine serves numerous concept-based search functions; for
example, it understands idiomatic phrases, and automatically expands the
user's search term to cover sets of equivalent words through its more than
a quarter-of-a-million-word thesaurus.

(Table Omitted) 

(Photograph Omitted) 

Perhaps even further along the path of nontraditional text retrieval is HNC 
Software...



26/3,K/5 (Item 5 from file: 15)

DIALOG(R) File 15: ABI/Inform(R)

(c) 2003 ProQuest Info&Learning. All rts. reserv. 
01172548 98-21943 

Supplement to 1995 ASIS annual meeting proceedings 

Anonymous 

American Society for Information Science. Bulletin v22n2 PP: 21-27 Dec 
1995/Jan 1996 

ISSN: 0095-4403 JRNL CODE: BAS 
WORD COUNT: 5098 

...TEXT: in SGML markup format and include text, figures, images and
equations. The Illinois DLI Project is investigating effective search 
and retrieval database structures and interface designs that utilize the 
ability of SGML to identify the content... 

set includes multimedia context-sensitive help and demonstration 
searches; dynamic word wheel displays (letter-by-letter word dictionary 
displays); word spell checking; search trees, result ranking and best 
match searching; and links to thesauri and related word strings generated 
by co-occurrence rankings.

Ray R. Larson, School of Library and Information Studies, University of 
California at Berkeley, Berkeley, California Cheshire... 



26/3,K/6 (Item 6 from file: 15)

DIALOG(R) File 15: ABI/Inform(R)

(c) 2003 ProQuest Info&Learning. All rts. reserv.

01094347 97-43741

The ONLINE 100: ONLINE Magazine's Field Guide to the 100 Most Important
Online Databases

Gehrig, Virginia Gatcheff

Information Today v12n8 PP: 19, 22 Sep 1995
ISSN: 8755-6286 JRNL CODE: IFT
WORD COUNT: 531 

...TEXT: much less painful with his collection of the best 100 databases. 
The book is a directory of various types of databases available in the 
online world. Each database profile contains a brief description of the 
database, a "Content Notes" section, which summarizes the content of the
database, a "Search Notes" section, which gives tips on effective
searching, a section called "Do Not Use For," which notes the
limitations of the database, and the "Key Facts" section, which lists the
time span of the database, the producer, which systems carry it, where to
find...



26/3,K/7 (Item 7 from file: 15)

DIALOG(R) File 15: ABI/Inform(R)

(c) 2003 ProQuest Info&Learning. All rts. reserv.

01084169 97-33563

Finding case studies online

Ojala, Marydee

Online v19n5 PP: 32-36 Sep/Oct 1995
ISSN: 0146-5422 JRNL CODE: ONL 
WORD COUNT: 3039 

...TEXT: to help you find case studies. Look at UMI's files, for example.
ABI/INFORM recognized the importance of case studies early and created a
standalone thesaurus term for the concept. A search for your topic
combined with the descriptor term, Case Studies, makes a very effective
search strategy. This descriptor is not used in UMI's Business Dateline
file, although one record did manage...



26/3,K/8 (Item 8 from file: 15)

DIALOG(R) File 15: ABI/Inform(R)

(c) 2003 ProQuest Info&Learning. All rts. reserv.

01059068 97-08462
DataTimes' big move

O'Leary, Mick

Information Today v12n6 PP: 16-17+ Jun 1995
ISSN: 8755-6286 JRNL CODE: IFT
WORD COUNT: 1873 

...TEXT: diverse set of databases. The screens are bright, attractive, and 
uncluttered. In both novice and command modes, search steps are 
efficiently and logically presented. Documentation in the Windows Help 
section is clear and thorough. EyeQ's major weakness is in the
arrangement of databases. Several preformatted groupings are provided,
but it is not easy to tell what sources are in what category...



26/3,K/9 (Item 9 from file: 15)

DIALOG(R) File 15: ABI/Inform(R)

(c) 2003 ProQuest Info&Learning. All rts. reserv.

01045495 96-94888

Silver Platter CD-ROM discs

Ashworth, Wilfred

New Library World v96n1121 PP: 37 1995
ISSN: 0307-4803 JRNL CODE: NLW 
WORD COUNT: 658 



...TEXT: is almost a quarter of an inch thick and runs to 92 pages listing
more than 200 databases. Many of these databases are available from
other suppliers but differ in the layout and in the search software which
accompanies...

...their discs--a distinct advantage because it has to be learned only once
and the owner of several databases does not have to install special
software for each which would take up valuable space on hard disk.
Currently the search software comes on a separate CD-ROM which will
install either SPIRS for DOS, or WINSPIRS (the Windows version). It also
carries...

...edition of a textual database is one which can be confidently
recommended for ease of use and effective searching.

Nursing and Allied Health (CINAHL) is a comprehensive database of citations 
to nursing and health literature, 1983... 



26/3,K/10 (Item 10 from file: 15)

DIALOG(R) File 15: ABI/Inform(R)

(c) 2003 ProQuest Info&Learning. All rts. reserv.
00771568 94-20960

Search patterns of remote users: An analysis of OPAC transaction logs

Millsap, Larry; Ferl, Terry Ellen

Information Technology & Libraries v12n3 PP: 321-342 Sep 1993
ISSN: 0730-9295 JRNL CODE: JLA 
WORD COUNT: 9336 

...TEXT: A zero retrieval in itself is neutral. It must always be viewed in
some context to gain significance. For example, a knowledgeable observer
may detect that a search word is misspelled, a command word is invalid, a
search term does not match the controlled vocabulary of the index being
searched, and so forth. Or the observer may note that a succession of
searches strongly suggests the user is in the wrong database.

Nevertheless, OPAC system designers and researchers remain concerned about
large numbers of searches with zero retrievals. In 1988, Clifford Lynch
stated that the statistics for zero retrievals in the MELVYL catalog were
"alarming." At that time, about 31.5% of MELVYL searches in COMMAND mode
resulted in zero retrievals. The figure still remains at that level. For
Lynch's concerns on this and related matters, see his "Large Database and
Multiple Database Problems in Online Catalogs," in OPACs and Beyond
(Dublin, Ohio: OCLC, 1989).
18. A MELVYL catalog session... 



26/3,K/11 (Item 11 from file: 15)

DIALOG(R) File 15: ABI/Inform(R)

(c) 2003 ProQuest Info&Learning. All rts. reserv.
00728298 93-77519

"Overnight Delivery" and Records Management - Microwave Style

Hartinger, Verna J.

Information Today v9n7 PP: 30-32 Jul/Aug 1992
ISSN: 8755-6286 JRNL CODE: IFT
WORD COUNT: 1479 

...TEXT: member, expanding her role to system administrator/database
analyst, has since successfully converted or designed over 60 databases
for numerous applications including a product literature file with
integrated thesaurus, current awareness publication production system
with Microsoft Word compatible output formats, library book acquisitions
chargeback and statistical reporting system, online catalog, serials
check-in and routing, literature search chargeback, and statistical
reporting system with output to EXCEL, etc. A major strength of this
system is the ability to bring up new applications virtually with the speed
of light...



26/3,K/12 (Item 12 from file: 15)

DIALOG(R) File 15: ABI/Inform(R)

(c) 2003 ProQuest Info&Learning. All rts. reserv.
00663577 93-12798

Belief function model for information retrieval

da Silva, Wagner Teixeira; Milidiu, Ruy Luiz

Journal of the American Society for Information Science v44n1 PP: 10-18
Jan 1993 

ISSN: 0002-8231 JRNL CODE: ASI 

ABSTRACT: The Belief Function Model (BFM) for automatic indexing and
ranking of documents with respect to a given user query is presented.
This model is based on a controlled vocabulary, like a thesaurus, and
on term frequencies in each document. Descriptors in the vocabulary are
terms selected from among their synonyms to be used as index terms. It is
possible for...

...models are not adequate to handle them. However, a belief function can
still be defined over a thesaurus of descriptors. Belief functions over
the descriptors can represent a document or a user query. The agreement
between a document belief function and a query belief function can be
computed. Therefore, it is proposed that the set of documents be ranked
according to their agreement with the given user query. The BFM is shown to
be wider in...



26/3,K/13 (Item 13 from file: 15)

DIALOG(R) File 15: ABI/Inform(R)

(c) 2003 ProQuest Info&Learning. All rts. reserv.
00655519 93-04740 

Holding the Reins on Distributed Databases 

Wolfson, Ken 

Chief Information Officer Journal v5n2 PP: 48-51 Fall 1992 
ISSN: 0899-0182 JRNL CODE: CJL 
WORD COUNT: 1664 

...TEXT: basic concepts. 

WHAT IS A DISTRIBUTED DATABASE? 

Distributed database is a catch-all term used to describe several types 
of database processing capabilities — specifically, remote request, remote 
unit of work, distributed unit of work, and distributed request. Of... 

...database processing, only the distributed unit of work and distributed
request support transactions in which data are split across two or more
physical databases. This is what people usually think of when they hear
"distributed...

...per transaction. Some basic definitions follow. For consistency, the
term client is used to describe any application function that requests 
services (e.g., create, read, update, delete) from a database. 

REMOTE REQUEST. A remote request allows a... 



26/3,K/14 (Item 14 from file: 15)

DIALOG(R) File 15: ABI/Inform(R)

(c) 2003 ProQuest Info&Learning. All rts. reserv.
00495452 90-21209

Semi-Automatic Determination of Citation Relevancy: User Evaluation

Huffman, G. David 

Information Processing & Management v26n2 PP: 295-302 1990 
ISSN: 0306-4573 JRNL CODE: IPM 



ABSTRACT: Online bibliographic database searches typically generate
hundreds of retrieved citations with only about 20%-40% relevant to the
search topic and/or problem statement. A significant amount of time is
required to categorize and select the relevant citations. A software
system, SORT-AID/SABRE, has been developed that analyzes citations from
various databases, reviews, categorizes, and searches citations for
specified text strings, ranks the citations by relevance, and prints the
citations in a user-specified format. The citation-ranking process uses
a thesaurus of terms selected from the citations using lexical
association metrics. The user provides a relevancy evaluation of each term.
The user term assessment is combined with lexical association metrics, and
the citations are ranked by relevance. A comprehensive user evaluation of
the relevance-ranking procedures shows that the software generated
distributions approach those of the end user in 22% of the...



26/3,K/15 (Item 15 from file: 15)

DIALOG(R) File 15: ABI/Inform(R)

(c) 2003 ProQuest Info&Learning. All rts. reserv.
00486100 90-11857 

A Framework for Evaluating CD-ROM Retrieval Software 

Nicholls, Paul; Han, Isaac; Stafford, Karen; Whitridge, Katherine 
Laserdisk Professional v3n2 PP: 41-46 Mar 1990 
ISSN: 0896-4149 JRNL CODE: LDP 

...ABSTRACT: the retrieval engine supplied with the product. The 
proliferation of software and the increasing availability of single 
databases under several different access programs make software 
evaluation an important component in the overall CD-ROM assessment process. 
When...

...most important evaluation criteria are users, requirements, and
constraints. Other evaluation criteria for access software can be divided
into 5 broad categories: 1. hardware and software dependencies, 2.
interface features, 3. search and retrieval functions, 4. output
functions, and 5. general production features. A checklist is provided that
outlines the general evaluation...



26/3,K/16 (Item 1 from file: 647)

DIALOG(R) File 647: CMP Computer Fulltext
(c) 2003 CMP Media, LLC. All rts. reserv. 

00598125 CMP ACCESSION NUMBER: CWK19911209S0297 

Gupta Preps SQLBase NLM-Claims database performs at twice the speed of 
Oracle's NLM 

MICHAEL DORTCH ; STANLEY GIBSON 
COMMUNICATIONSWEEK, 1991, n 381, 1 
PUBLICATION DATE: 911209 

JOURNAL CODE: CWK LANGUAGE: English 

RECORD TYPE: Fulltext
SECTION HEADING: News 
WORD COUNT: 738 

...features discussed in August, which the CTA's Saks said apparently
have been implemented, included support for databases partitioned
across multiple disk drives or servers, faster and more efficient
database queries, and maintenance of data integrity during accesses and
manipulations by multiple users.

All versions of SQLBase Server... 



26/3,K/17 (Item 1 from file: 275)

DIALOG(R) File 275: Gale Group Computer DB(TM)
(c) 2003 The Gale Group. All rts. reserv. 



01998663 SUPPLIER NUMBER: 18733376 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Performance tuning. (Seven DBMS utilities) (Product Information) 
Rennhackkamp, Martin 
DBMS, v9, n11, p85(5)
Oct, 1996 

ISSN: 1041-5173 LANGUAGE: English RECORD TYPE: Fulltext; Abstract 

WORD COUNT: 5474 LINE COUNT: 00429 

when reasonably stable tables are often joined by exact-match
queries.

You can select whether Oracle must optimize your queries using
its older rule-based optimizer or its newer cost-based optimizer. The
rule-based optimizer chooses an execution plan based on the available
access paths and the ranks of these access paths in a published table.
The cost-based optimizer chooses an execution plan based on the available
access paths as well as on statistics in the data dictionary for the
tables, clusters, and indexes. You can also add so-called "hints" (or
optimization suggestions) to...



26/3,K/18 (Item 2 from file: 275)

DIALOG(R) File 275: Gale Group Computer DB(TM)
(c) 2003 The Gale Group. All rts. reserv. 

01839641 SUPPLIER NUMBER: 17486933 

Winning the client-server game. (ODBC) 

Malik, A. Nicklas 

Windows Tech Journal, v4, n8, p28(5) 
August, 1995 

ISSN: 1061-3501 LANGUAGE: English RECORD TYPE: Abstract 

...ABSTRACT: Jet engine provides a common way to use ODBC. Jet includes a
full SQL engine, and can parse SQL statements and optimize queries.
Jet is tuned to interface with server data, and permits both forward and
backward motion without having to manage multiple database connections.
Jet's Data Object layer provides a single access method, and makes
databases available to custom...



26/3,K/19 (Item 3 from file: 275)

DIALOG(R) File 275: Gale Group Computer DB(TM)
(c) 2003 The Gale Group. All rts. reserv. 

01776287 SUPPLIER NUMBER: 16854952 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Complex databases can speed response via parallel hardware, software 
design. 

Gallagher, Bob 

PC Week, v12, n16, p83(1)

April 24, 1995 

ISSN: 0740-1604 LANGUAGE: ENGLISH RECORD TYPE: FULLTEXT 

WORD COUNT: 1176 LINE COUNT: 00096 

...already added them to their products. The databases simply
coordinate multiple load or dump tasks running on separate processors.

Optimizing queries for quicker response is more difficult and is 
probably the main area on which vendors of parallel... 



26/3,K/20 (Item 4 from file: 275)

DIALOG(R) File 275: Gale Group Computer DB(TM)
(c) 2003 The Gale Group. All rts. reserv. 

01707742 SUPPLIER NUMBER: 16285678 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Pass the word, keep the data. (BrainTree Technology Inc's SQLOSecure 
Database Password Manager) (Product Announcement) 

Morrison, Kristine M. 

DEC Professional, v13, n9, p16(1)

Sept, 1994 



DOCUMENT TYPE: Product Announcement ISSN: 0744-9216 LANGUAGE: ENGLISH
RECORD TYPE: FULLTEXT; ABSTRACT

WORD COUNT: 518 LINE COUNT: 00042 



...ABSTRACT: database systems. The software includes a password client and
server, and a password checker. The client portion requests usernames and
passwords, and the server keeps a table of password data. The table can be
accessed...

...server can synchronize database and operating system passwords, and can
update other password servers for synchronization across multiple
databases. The server also contains a dictionary of passwords that are
thought to be easily guessable, and will provide users with a ranking of
their password's guessability.



26/3,K/21 (Item 5 from file: 275)

DIALOG(R) File 275: Gale Group Computer DB(TM)
(c) 2003 The Gale Group. All rts. reserv. 

01687692 SUPPLIER NUMBER: 15516955 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Xyvision's PDM: integrated document and workflow management. (Parlance
Document Manager document sharing and workgroup software) (includes 
related articles on a glossary of Xyvision PDM terms, an Xyvision company 
profile and PDM system pricing) (Cover Story) 
Karsh, Arlene E. 

Seybold Report on Publishing Systems, v23, n17, p3(27)
May 30, 1994 

DOCUMENT TYPE: Cover Story ISSN: 0736-7260 LANGUAGE: ENGLISH 

RECORD TYPE: FULLTEXT 

WORD COUNT: 23311 LINE COUNT: 01871 

into managing more diverse objects, including video, animation and
even sound, other viewers will also be necessary. * Multiple database
access. Xyvision has gone the extra mile to accommodate user requests for
additional functionality thus far, and we would expect it to continue in
this manner. The new Windows client, undoubtedly...

...of the software. Another user-oriented refinement that we think deserves
attention is the ability to access multiple databases (currently there
can be only one) using the same sql sequences and interface. Many sites
will have...
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01651543 SUPPLIER NUMBER: 15590489 

Progress in database search strategies. 

Yu, Clement; Meng, Weiyi 

IEEE Software, v11, n3, p11(9)

May, 1994 

ISSN: 0740-7459 LANGUAGE: ENGLISH RECORD TYPE: ABSTRACT 

...ABSTRACT: records. However, such precision and speed is difficult and 
costly to achieve where data is dispersed among many relational 
databases located throughout a network. This is especially so if data is 
unstructured. For a distributed relational database system where relations 
are commonly divided up into fragments, there are several recommended 
methods to efficiently act on queries . Among them are the 
identification of local processing opportunities, adoption of a 
fragment-and-replicate strategy, use of partition-and-replicate technique
and hashed partitioning. For heterogeneous multidatabases, there are
several factors to consider such as the front end, schema integration and 
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01624867 SUPPLIER NUMBER: 14468976 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

SuperNova version 3.2. (Four Seasons Software Inc.'s database application
development software) (Software Review) (Evaluation)

Linthicum, David
DBMS, v6, n12, p30(3)
Nov, 1993 

DOCUMENT TYPE: Evaluation ISSN: 1041-5173 LANGUAGE: ENGLISH 

RECORD TYPE: FULLTEXT; ABSTRACT 

WORD COUNT: 1507 LINE COUNT: 00124 

read the database schema of any of the DBMSs it supports and
automatically construct its own data dictionary. The data dictionary
always looks the same, no matter what DBMS you are employing. Therefore,
you can easily apply applications...

...the database products that SuperNova supports your choice of database
proves unimportant. You can easily develop for multiple database servers
with virtually no extra learning curve or code changes. This independent
data dictionary is another strength of the product.
Distributing Data

You can distribute your databases anywhere on a network. The
administrator need...
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01584388 SUPPLIER NUMBER: 13440954 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Creating a CD-ROM: overview of the product field. (CD-ROM authoring and 
data retrieval software packages; includes company directory and related 
article on resources for doing research) (Buyers Guide) 

Banet, Bernard 

Seybold Report on Desktop Publishing, v7, n6, p3(29) 
Feb 1, 1993 

DOCUMENT TYPE: Buyers Guide ISSN: 0889-9762 LANGUAGE: ENGLISH

RECORD TYPE: FULLTEXT

WORD COUNT: 17829 LINE COUNT: 01443

... but at present pages are not represented as such. 

Graphics and text can be viewed together or separately, depending on
the platform. Searches can be done across multiple documents on multiple
databases. Relevance ranking is determined by number of matches within
a document. A dictionary is provided to identify terms in the database
and ensure proper spelling. Proximity and Boolean searches are supported. 

Some Hyperlinks are pre-indexed, such as references, citations and a 
table of contents outline. . . 
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Pushing Oracle to the limit: rules of thumb for getting top performance 
from Oracle Server. (Hands On) (Tutorial) 

Butler, Brian; Strehlo, Kevin 
DBMS, v4, n13, p58(6)
Dec, 1991 

DOCUMENT TYPE: Tutorial ISSN: 1041-5173 LANGUAGE: ENGLISH 

RECORD TYPE: FULLTEXT; ABSTRACT 

WORD COUNT: 5105 LINE COUNT: 00381 



an index leaf page to the data page. 



It is also important to realize that Oracle's query optimizer
does not use statistical methods in order to pick a particular access path.
In order to build the query plan, the optimizer first parses the SQL
statement to determine which database objects are being referenced; second,
it initiates a query of the data dictionary to learn about the type,
content, and location of those objects; and third, it uses a query of the
data dictionary to see what indexes are available for use in building the
query plan. Once the optimizer has identified all possible access paths, it
ranks them according to the rules shown here and chooses the highest
ranking path:

1. ROWID = constant is the fastest path to a row. 

2. Indexed columns are better than... 
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DIALOG (R) File 275: Gale Group Computer DB(TM) 
(c) 2003 The Gale Group. All rts. reserv. 

01448851 SUPPLIER NUMBER: 11278335 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Help rushes to the front; GUI tools rolling out as enabling machines, 
software proliferate. (Data Resource Management: Moving to Server 
Databases) (graphical user interface) (Client/Server Computing supplement 
to Software Magazine) 
Bochenski, Barbara 

Software Magazine, v11, n11, pS18(2)
Sept, 1991 

ISSN: 0897-8085 LANGUAGE: ENGLISH RECORD TYPE: FULLTEXT; ABSTRACT 

WORD COUNT: 924 LINE COUNT: 00079 

ABSTRACT: Front-end tools are becoming available for client-server 
architectures that use PC graphics, connect to several server databases 
and provide powerful object-oriented functions for professional 
applications developers. The tools support shared repository and data 
dictionary environments across networks in a client/server mode and are 
project-oriented for teams, according to Digital... 

...the debugger and another examining the output from the program. 
Powersoft's PowerBuilder is one such industrial-strength front-end tool,
according to Schussel; the program runs under Windows 3.0, but future
releases will...



26/3,K/27 (Item 11 from file: 275)

DIALOG(R) File 275: Gale Group Computer DB(TM)
(c) 2003 The Gale Group. All rts. reserv. 

01308184 SUPPLIER NUMBER: 07735178 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

The state of play in the world of IBM's DB2 - 2. The benefits of release 
2.2. 

White, Peter 

Computergram International, n1279, CGI10060009
Oct 6, 1989 

ISSN: 0268-716X LANGUAGE: ENGLISH RECORD TYPE: FULLTEXT 

WORD COUNT: 1258 LINE COUNT: 00092 

...are they? The answer is query speed and distributed database. Well,
as the entire world knows distributed database means many things to
many people, and most of them don't work. However on the query front, IBM

...query. It has done this by using multiple index access paths or
multi-index searching. Imagine a query that wants to explore three
separate fields, dictating a maximum or minimum value to each and slim
down the records just to those that comply. It is the sort of query
function that relational databases seemed to be invented for, for instance 
"Find me all the employees that have... 
DIALOG (R) File 275: Gale Group Computer DB(TM)
(c) 2003 The Gale Group. All rts. reserv. 

01185869 SUPPLIER NUMBER: 04711206 

CD-ROM technology takes off. (PC VAR Report supplement to Computer Reseller 
News) 

Trespasz, Nancy 

Computer Reseller News, n195, pS34(2)
March 16, 1987 

ISSN: 0893-8377 LANGUAGE: ENGLISH RECORD TYPE: ABSTRACT 

ABSTRACT: The CD-ROM market is growing in strength and showing greater 
potential. Frost and Sullivan says that CD-ROM has greater potential for 
acceptance as... 

...software developers hesitant to write for CD-ROM until more disks are on 
the market. Analysts predict many CD-ROM databases will enter the 
market in 1987, and Dataquest forecasts an installed base of 60,000 CD-ROM



...programs; R.R. Bowker's CD-ROM versions of "Books in Print" and 
"Ulrich's International Periodical Dictionary". Sony Corp. makes a $500
CDU-100 with a built-in power supply, the $400 CDU-5002 . . . 



26/3, K/29 (Item 13 from file: 275) 

DIALOG (R) File 275: Gale Group Computer DB(TM) 
(c) 2003 The Gale Group. All rts. reserv. 

01119755 SUPPLIER NUMBER: 00629092 

Enquire Within: Hints on Constructing a Free-Text Data Base. 

Lewis, M. 

Practical Computing, v8, n6, p43-44 
June, 1985 

DOCUMENT TYPE: column ISSN: 0141-5433 LANGUAGE: ENGLISH

RECORD TYPE: ABSTRACT 

...ABSTRACT: index can be cut down more by compressing the words into 
6-bit characters and using a thesaurus to group similar terms together, 
looking for the index entries that match the first term in the... 

...creating a sub-list of all entries that match the first word. Some data 
bases offer a weighted-term search wherein a score or weight for each
word being looked for gives each one a priority. The articles are then
ranked according to the combined score of the words they contain in
descending order. Fuzzy matching and selective dissemination are other 
functions used in search procedures. An example of an inverted free-text 
data base is included. 
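
Searcher's note: the weighted-term search over an inverted free-text index that this abstract describes can be sketched as below. This is an illustrative sketch only, not the example from the article; the sample articles, weights, and function names are assumptions.

```python
from collections import defaultdict

def build_inverted_index(articles):
    """Map each lowercase word to the set of article ids that contain it."""
    index = defaultdict(set)
    for article_id, text in articles.items():
        for word in text.lower().split():
            index[word].add(article_id)
    return index

def weighted_search(index, term_weights):
    """Rank articles by the summed weights of the query terms they contain."""
    scores = defaultdict(float)
    for term, weight in term_weights.items():
        for article_id in index.get(term.lower(), ()):
            scores[article_id] += weight
    # Highest combined score first; ties broken by article id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical three-article collection.
ARTICLES = {
    1: "free text data base search",
    2: "inverted index for a data base",
    3: "thesaurus groups similar index terms",
}
INDEX = build_inverted_index(ARTICLES)
RANKED = weighted_search(INDEX, {"index": 3, "base": 2, "thesaurus": 1})
```

Article 2 ranks first because it contains both of the two most heavily weighted terms; the ranking is in descending order of combined score, as in the abstract.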



26/3, K/30 (Item 1 from file: 674) 

DIALOG (R) File 674: Computer News Fulltext

(c) 2003 IDG Communications. All rts. reserv. 

078349 

New dictionary defines cyber-threats

Byline: Dorte Toft 
Journal: Network World 
Publication Date: October 01, 1999 
Word Count: 593 Line Count: 54 

Text:

The first official dictionary defining terms used to discuss computer 
systems vulnerabilities has been released. It may be scary reading for. . . 

. . . confusion arising from the fact that each of those bugs goes by many 
different names, registered in many different databases by vendors and
security organizations, according to Peter Tasker, executive director of
security and information at Mitre... 

... engineering company based in Bedford, Mass., is the standard bearer of 
the Common Vulnerabilities and Exposures (CVE) dictionary and its 
electronic host (it is available at http://www.cve.mitre.org). Thus far the 
dictionary contains 321 entries, mostly bugs in operating systems such as 
in Windows NT, various Unix flavors and... 

...vendor with most entries in CVE. While SANS' Northcutt says that the CVE
will have an educational influence, its authors hope that at least one
group doesn't learn too much from it. "We did... 

... be accused of providing crackers with information. That is why we have 
limited it to being a dictionary, without cross references, without
hyperlinks to where the problem is discussed in details," Tasker says. 
Mitre can. . . 



26/3, K/31 (Item 1 from file: 696) 

DIALOG (R) File 696: DIALOG Telecom. Newsletters 
(c) 2003 The Dialog Corp. All rts. reserv. 

00704702 
Growth, Don't Fail Us Now 

Telecoms & Wireless Asia 

December 10, 1999 DOCUMENT TYPE: NEWSLETTER 
PUBLISHER: PYRAMID RESEARCH 

LANGUAGE: ENGLISH WORD COUNT: 4018 RECORD TYPE: FULLTEXT

(c) 1999 The Economist Intelligence Unit Ltd. 

TEXT: 

...hit the region at the most precisely opportune time. While "economic 
crisis" has been removed from their vocabularies, opcos throughout the
region are looking for anything that will return them to the heady days of 
...access -- a key platform for E-commerce -- will be limited.

Education and skills. Improving education standards should rank at the 
top of planners' priorities. The success of India's software industry
reflects how investors will ... cellular operators are rolling out data 
services as a premium. Short messaging services and transactional 
E-commerce functions — bank account queries and stock purchases — are 
already staples. 

The next leap is from simple data services to slimmed-down . . . is to seize 
upon the Wideband CDMA (WCDMA) platform that it has championed into a 
position of strength in the region and elsewhere. Already the carrier has 
signed MOUs with many carriers in Asia and... 
DIALOG (R) File 636: Gale Group Newsletter DB(TM)
(c) 2003 The Gale Group. All rts. reserv. 

03522703 Supplier Number: 47275450 (USE FORMAT 7 FOR FULLTEXT) 
FULCRUM TECHNOLOGIES: Fulcrum announces Java Developers' Toolkit for rich, 
web-based search apps 

M2 Presswire, pN/A 
April 7, 1997 

Language: English Record Type: Fulltext 
Document Type: Newswire; Trade 
Word Count: 945

like searching using Boolean operators, as well as phrase, 
proximity and wildcard searching - along with more advanced functionality,
such as search-term highlighting, relevance ranking, Intuitive
Searching (Fulcrum's exclusive similarity searching feature), linguistic 
expansion, natural language searching, and an international thesaurus 
that supports all major European languages. 



* Inherent security - Fulcrum SearchBuilder for Java benefits from the
security features... 



26/3, K/33 (Item 2 from file: 636) 

DIALOG (R) File 636: Gale Group Newsletter DB(TM) 
(c) 2003 The Gale Group. All rts. reserv. 

02683335 Supplier Number: 45442519 (USE FORMAT 7 FOR FULLTEXT) 
APPLICATIONS SPOTLIGHT: Airpower — An Interactive History of Powered 
Flight 

Multimedia & Videodisc Monitor, v13, n4, pN/A
April, 1995 

Language: English Record Type: Fulltext 
Document Type: Newsletter; Trade 
Word Count: 1398 

data base. The next priority in the World War I treatment was to 
create an infinitely expandable data base with multiple entry points, 
search strategies, and tools for manipulating data — with maximum use of 
off-the-shelf software to minimize custom. . . 

. . .Windows tie-ins for program architecture and video playback. The data 
base uses a WAIS inverted index search engine with three distinct 
interfaces for access. The first and default interface is an historically 
oriented "Theatre. . . 

. . .and subject header that parses the data base in real time. The second is 
"Applesque" and allows search by title, author, article content, or 
general index. The third interface permits bipolar relational concept 
browsing. For example, the user could search the data base using the 
concepts of morality and chivalry as they relate to the article on. . . 

...von Richthofen, or training of French, English, German, and American 
pilots for the war. In all cases, searches are machine-generated based on 
the number of times selected key words appear in an article as... 

...to article text and hits in the abstract count double. This feature lets 
system administrators insert and weight core or meta concepts that do not 
appear in article text but are nonetheless present. On the other hand, 
machine-generated searches avoid the highly subjective and
labor-intensive process of manually assigning concepts to each article and
parametrically "weighting" them on a relative scale. "Airpower's"
relational data base is equipped with a full array of informational and 
manipulative tools. Articles are supported by a glossary that explains 
technical aviation terms, while a click on the "place" button brings up a 
map relevant. . . 

...of concept, program tutorials will be built using the "assemble" tool 
and then imported into the information section of the main toolbar. 
Project designers recognized creation as the highest form of understanding 
and devoted considerable. . . 
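
Searcher's note: the machine-generated scoring mentioned in this record (keyword occurrences counted per article, with hits in the abstract counting double) might look like the sketch below. The excerpt is truncated, so the exact rule is not known; the function name, the whitespace tokenization, and the sample text are assumptions.

```python
def keyword_score(keywords, article_text, abstract):
    """Count keyword occurrences; each hit in the abstract counts twice."""
    text_words = article_text.lower().split()
    abstract_words = abstract.lower().split()
    score = 0
    for keyword in keywords:
        key = keyword.lower()
        score += text_words.count(key) + 2 * abstract_words.count(key)
    return score

# Hypothetical article: two text hits plus two abstract hits (doubled).
SCORE = keyword_score(
    ["chivalry", "morality"],
    article_text="pilots spoke of chivalry and of chivalry lost",
    abstract="chivalry and morality in the air war",
)
```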
DIALOG (R) File 636: Gale Group Newsletter DB(TM)
(c) 2003 The Gale Group. All rts. reserv. 

02162486 Supplier Number: 44058457 (USE FORMAT 7 FOR FULLTEXT) 

Metal property database available 

Coal & Synfuels Technology, v14, n33, pN/A

August 30, 1993 

Language: English Record Type: Fulltext 
Document Type: Newsletter; Trade 
Word Count: 115 

scientific and technical databases, recently added METALCREEP, a
numeric database of creep and rupture stress properties for various
metals.

The database contains property information for 144 different metals.
That information includes creep rate, time, strength and strain; exposure
temperature and time; elongation at break; creep rupture strength;
rupture life; and tensile yield strength or ultimate strength.
METALCREEP also contains high-temperature tensile properties for a wide
range of steels and aluminum alloys, as well as an on-line thesaurus.

For more information, call Chemical Abstracts Service, 614-447-3600. 

COPYRIGHT 1993 BY PASHA PUBLICATIONS INC. 

26/3, K/35 (Item 4 from file: 636) 

DIALOG (R) File 636: Gale Group Newsletter DB(TM) 
(c) 2003 The Gale Group. All rts. reserv. 

01460880 Supplier Number: 41984747 (USE FORMAT 7 FOR FULLTEXT) 
AMDAHL RESPONDS TO AD/CYCLE
Report on IBM, v8, n14, pN/A
April 3, 1991 

Language: English Record Type: Fulltext 
Document Type: Newsletter; Trade 
Word Count: 1131 

... software management strategies at Gartner Group (Stamford, Conn.). 

Huron is really an advanced 4GL with support for many databases. Percy
said the language looks good and the dictionary environment looks strong, 
but Huron is still heavily weighted toward the mainframe environment. 

Early installations will be confined to systems running IBM's MVS 
operating system. . . 
DIALOG (R) File 636: Gale Group Newsletter DB(TM)
01155982 Supplier Number: 40973361 (USE FORMAT 7 FOR FULLTEXT)

THE STATE OF PLAY IN THE WORLD OF IBM's DB2 - 2 THE BENEFITS OF RELEASE 2.2 

Computergram International, n1278, pN/A
Oct 8, 1989 

Language: English Record Type: Fulltext 
Document Type: Newswire; Trade 
Word Count: 1193 

(USE FORMAT 7 FOR FULLTEXT) 
TEXT: 

. . .are they? The answer is query speed and distributed database. Well, as
the entire world knows distributed database means many things to many 
people, and most of them don't work. However on the query front, IBM... 

. . .query. It has done this by using multiple index access paths or 
multi-index searching. Imagine a query that wants to explore three 
separate fields, dictating a maximum or minimum value to each and slim
down the records just to those that comply. It is the sort of query 
function that relational databases seemed to be invented for, for instance 
"Find me all the employees that have. . . 



26/3, K/37 (Item 1 from file: 813) 

DIALOG (R) File 813: PR Newswire 

(c) 1999 PR Newswire Association Inc. All rts. reserv. 
1067568 OT012 

Fulcrum Announces Java Developers' Toolkit for Rich, Web-Based Search 
Applications 

DATE: March 12, 1997 08:18 EST WORD COUNT: 953 



...like searching using Boolean operators, as well as phrase, proximity and
wildcard searching - along with more advanced functionality, such as
search-term highlighting, relevance ranking, Intuitive Searching (TM)
(Fulcrum's exclusive similarity searching feature), linguistic expansion,
natural language searching, and an international thesaurus that supports 
all major European languages. 

INHERENT SECURITY - Fulcrum SearchBuilder for Java benefits from the 
security features... 
DIALOG (R) File 16: Gale Group PROMT (R)

(c) 2003 The Gale Group. All rts. reserv. 

06969271 Supplier Number: 58831224 (USE FORMAT 7 FOR FULLTEXT) 
Risk management information resources listing. 

Business Insurance, v34, p13
Jan 17, 2000 

Language: English Record Type: Fulltext 
Document Type: Magazine/Journal; Trade
Word Count: 2113 

(USE FORMAT 7 FOR FULLTEXT) 
TEXT:

...and property damage caused by power plant explosions and fires, 
featuring a list of methods to building effective safety practices. 
Request item 1113 * A Self-Evaluation for Workplace Violence from Marsh 
Risk Consulting helps organizations prepare for such... 

...Management designation program describes the contents of the class. 
Request item 1201 REINSURANCE * Gill & Roeser Inc.'s Glossary of Selected 
Reinsurance Terms covers the life/health and property/casualty insurance 
industries. Request item 1301 * A. . . 

...item 1405 * Billing Notes outlines ICALM's criteria for understandable 
bills, serving as an example of cost-effective litigation management.
Request item 1406 * ICALM Claim Audits lists questions one would ask when 
evaluating when and where to conduct. . . 



26/3, K/39 (Item 2 from file: 16) 

DIALOG (R) File 16: Gale Group PROMT (R) 

(c) 2003 The Gale Group. All rts. reserv. 

05401315 Supplier Number: 54555037 (USE FORMAT 7 FOR FULLTEXT) 
So, tell us, what do you really, really want? 
Bird, Julie 

Precision Marketing, p21(1)
April 14, 1997 

Language: English Record Type: Fulltext 
Document Type: Magazine/Journal; Trade
Word Count: 2392 

... is not enshrined in legislation yet, but the argument for positive
consent from the individual is gathering strength." The mergers and
takeovers taking place on the list supplier side are also having an impact 
upon . . . 

...houses are not letting the grass grow under their feet either. Brenda 
Boardman, list brokerage director at Lexicon Marketing Services, believes 
this is good news for clients: "It shows that the industry is moving closer 

...gives an organisation sales and marketing data when they need it, 
increasingly important as organisations adopt highly segmented 
relationship marketing models." The list market is evolving rapidly and 
this is no time for complacency - customers ... sustain growth in the long 
term." Now that list buyers are a lot more clued up about databases,
many of them are building their own. "There will be a trend for customers
to buy data for. . . 



26/3, K/40 (Item 3 from file: 16) 

DIALOG (R) File 16: Gale Group PROMT (R) 

(c) 2003 The Gale Group. All rts. reserv.

04361033 Supplier Number: 46395858 (USE FORMAT 7 FOR FULLTEXT)
Softscape introduces Softscape Explorer Plus, powerful new "Desktop 
Information Manager"; Improves on Windows Explorer by integrating 
advanced search and retrieval with object-based file management. 

Business Wire, p5201056 
May 20, 1996 

Language: English Record Type: Fulltext 
Document Type: Newswire; Trade 
Word Count: 1162 

... of the advanced searching features of the Topic engine, including
automatic search expansion using an on-line thesaurus, case matching and
"sounds like" functionality, ensuring that search results are
comprehensive and ranked by relevancy. Also inherited from Topic is the 
highest level of performance available — QuickFind can search for. . . 
(c) 2003 The Gale Group. All rts. reserv.

02010362 Supplier Number: 42581344 (USE FORMAT 7 FOR FULLTEXT) 
Gupta Preps SQLBase NLM 
CommunicationsWeek, p1
Dec 9, 1991 

Language: English Record Type: Fulltext 
Document Type: Newsletter; Trade 
Word Count: 753 

... features discussed in August, which the CTA's Saks said apparently
have been implemented, included support for databases partitioned
across multiple disk drives or servers, faster and more efficient
database queries, and maintenance of data integrity during accesses and
manipulations by multiple users.

All versions of SQLBase Server. . . 
DIALOG (R) File 160: Gale Group PROMT (R)

(c) 1999 The Gale Group. All rts. reserv. 

00894180 

Personal computers will be the spokes of the corporate information system 
to answer the computing needs of individual users, according to DB 
Holden, Digital Equipment Corp (Stow, MA).

Computerworld March 21, 1983 p. D1-ID61 

While personal computers cannot be controlled the way large systems 
have been, the information systems manager can influence and coordinate 
personal computer use by providing users with support in 2 key areas, viz, 
education and. . . 

...software tools and languages. However, the wide popularity of personal
computers has led to the development of many fragmented data bases 
and unintegrated office information systems. To realize the benefits of 
personal computers while reducing the potential risks... 

...efficiently large amounts of information. However, a more traditional DP 
approach is to create a common data dictionary and file structure between 
the data distribution system that contains the extracted corporate data and
the personal . . .
05719244 E.I. No: EI P00125426260

Title: Restructuring partitioned normal form relations without 
information loss 

Author: Vincent, Millist W.; Levene, Mark

Corporate Source: Univ of South Australia, Adelaide, Australia 

Source: SIAM Journal on Computing v 29 n 5 2000. p 1550-1567 

Publication Year: 2000 

CODEN: SMJCAT ISSN: 0097-5397 

Language: English 

Document Type: JA; (Journal Article) Treatment: T; (Theoretical) 
Journal Announcement: 0101W3 

Abstract: Nested relations in partitioned normal form (PNF) are an 
important subclass of nested relations that are useful in many 
applications. In this paper we address the question of determining when 
every PNF relation stored under one nested relation scheme can be 
transformed into another PNF relation stored under a different nested 
relation scheme without loss of information, referred to as the two schemes 
being data equivalent. This issue is important in many database 
application areas such as view processing, schema integration, and schema 
evolution. The main result of the paper provides two characterizations of 
data equivalence for nested schemes. The first is that two schemes are data 
equivalent if and only if the two sets of multivalued dependencies induced 
by the two corresponding scheme trees are equivalent. The second is that 
the schemes are equivalent if and only if the corresponding scheme trees 
can be transformed into the other by a sequence of applications of a local 
restructuring operator and its inverse. (Author abstract) 29 Refs. 

Descriptors: Relational database systems; Query languages; 
Optimization; User interfaces

Identifiers: Nest 

Classification Codes: 

723.1.1 (Computer Programming Languages) 

723.3 (Database Systems); 723.1 (Computer Programming); 921.5 
(Optimization Techniques); 722.2 (Computer Peripheral Equipment) 

723 (Computer Software); 921 (Applied Mathematics); 722 (Computer
Hardware)

72 (COMPUTERS & DATA PROCESSING); 92 (ENGINEERING MATHEMATICS) 
05484003 E.I. No: EI P00025039026

Title: Influence of data set splitting methods on similarity indexing 
performance 

Author: Bai, Xuesheng; Xu, Guangyou; Shi, Yuanchun; Yang, Shiqiang 
Corporate Source: Tsinghua Univ, Beijing, China 

Conference Title: Proceedings of the 2000 'Storage and Retrieval for
Media Databases 2000'

Conference Location: San Jose, CA, USA Conference Date: 
20000126-20000128

Sponsor: IS and T; SPIE 

E.I. Conference No.: 56354 

Source: Proceedings of SPIE - The International Society for Optical 
Engineering v 3972 2000. p 76-83 
Publication Year: 2000 
CODEN: PSISDG ISSN: 0277-786X 
Language: English 

Document Type: JA; (Journal Article) Treatment: T; (Theoretical) 
Journal Announcement: 0004W1 

Abstract: Similarity indexing is the supporting technology for fast 
content-based retrieval of large media databases, and many similarity
index structures have been proposed. Compared with the many structures
present, less attention has been paid to performance evaluation of index
structures and theoretic analysis on factors influencing index performance. 
In this paper, we attempt to solve part of the problem and focus our 
research on analyzing the influence of data splitting methods. To give
a formal definition for index structure performance evaluation, we 
introduce the query distribution probability concept and propose using 
average search cost to evaluate the performance of a similarity indexing 
structure. We choose the simplest case of similarity indexing - 
nearest-neighbor search in our discussion and deduce an expression for 
the average search cost function. Based on analysis of the expression,
we proposed some criteria that may be useful in index design and 
implementation. Then we extend these conclusions to the general similarity 
indexing case and use these criteria as general rules in index design and 
implementation. Basic thoughts and analysis are detailed, as well as 
experiment results. (Author abstract) 12 Refs. 

Descriptors: *Indexing (of information); Database systems; Multimedia
systems; Information retrieval; Data structures; Probability distributions;
Data reduction; Response time (computer systems); Computer systems
programming 

Identifiers: Data set splitting methods; Similarity indexing; Query 
distribution probability 
Classification Codes: 

903.1 (Information Sources & Analysis); 723.3 (Database Systems); 723.5 
(Computer Applications); 903.3 (Information Retrieval & Use); 723.2 
(Data Processing); 922.1 (Probability Theory) 

903 (Information Science); 723 (Computer Software); 922 (Statistical
Methods)

90 (GENERAL ENGINEERING); 72 (COMPUTERS & DATA PROCESSING); 92
(ENGINEERING MATHEMATICS)
05085926 E.I. No: EIP98084331475

Title: Graph-based parallel query processing and optimization 
strategies for object-oriented databases 

Author: Su, Stanley Y.W.; Huang, Ying; Akaboshi, Naoki

Corporate Source: Univ of Florida, Gainesville, FL, USA 

Source: Distributed and Parallel Databases v 6 n 3 Jul 1998. p 247-285 
Abstract: Much work has been accomplished in the past on the subject of
parallel query processing and optimization in parallel relational 
database systems; however, little work on the same subject has been done in 
parallel object-oriented database systems. Since the object-oriented view 
of a database and its processing are quite different from those of a 
relational system, it can be expected that techniques of parallel query 
processing and optimization for the latter can be different from the
former. In this paper, we present a general framework for parallel
object-oriented database systems and several implemented query 
processing and optimization strategies together with some performance 
evaluation results. In this work, multiwavefront algorithms are used in
query processing to allow a higher degree of parallelism than the 
traditional tree-based query processing. Four optimization strategies, 
which are designed specifically for the multiwavefront algorithms and for
the optimization of single as well as multiple queries, are introduced. The 
query processing algorithms and optimization strategies have been 
implemented on a parallel computer, nCUBE2; and the results of a 
performance evaluation are presented in this paper. The main emphases and 
the intended contributions of this paper are (1) data partitioning,
query processing and optimization strategies suitable for parallel
OODBMSs, (2) the implementation of the multiwavefront algorithms and
optimization strategies, and (3) the performance evaluation results. 
(Author abstract) 54 Refs. 
Abstract: An important feature of database technology of the nineties is
the use of parallelism for speeding up the execution of complex queries. 
This technology is being tested in several experimental database 
architectures and a few commercial systems for conventional 
select-project-join queries. In particular, hash-based fragmentation is
used to distribute data to disks under the control of different processors 
in order to perform selections and joins in parallel. With the development 
of new query languages, and in particular with the definition of transitive 
closure queries and of more general logic programming queries, the new 
dimension of recursion has been added to query processing. Recursive 
queries are complex; at the same time, their regular structure is
particularly suited for parallel execution, and parallelism may give a high 
efficiency gain. We survey the approaches to parallel execution of 
recursion queries that have been presented in the recent literature. We 
observe that research on parallel execution of recursive queries is 
separated into two distinct subareas, one focused on the transitive 
closure of Relational Algebra expressions, the other one focused on 
optimization of more general Datalog queries. Though the subareas seem 
radically different because of the approach and formalism used, they have 
many common features. This is not surprising, because most typical Datalog 
queries can be solved by means of the transitive closure of simple 
algebraic expressions. We first analyze the relationship between the 
transitive closure of expressions in Relational Algebra and Datalog 
programs. We then review sequential methods for evaluating transitive 
closure, distinguishing iterative and direct methods. We address the 
parallelization of these methods, by discussing various forms of 
parallelization. Data fragmentation plays an important role in obtaining
parallel execution; we describe hash-based and semantic fragmentation. 
Finally, we consider Datalog queries, and present general methods for 
parallel rule execution; we recognize the similarities between these 
methods and the methods reviewed previously, when the former are applied to 
linear Datalog queries. We also provide a quantitative analysis that shows 
the impact of the initial data distribution on the performance of methods. 
(Author abstract) 68 Refs. 
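
Searcher's note: one of the iterative methods this survey covers, combined with hash-based fragmentation, can be sketched as follows. This is a minimal illustration under stated assumptions, not an algorithm taken from the paper: the relation is a set of integer node pairs, edges are fragmented by hashing the source node, and the "workers" are simulated by a sequential loop. The function names are hypothetical.

```python
def build_successors(edges):
    """Index the base relation by source node for the join step."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)
    return successors

def closure_from(start_edges, successors):
    """Iterate to a fixpoint, joining only newly derived pairs with the base edges."""
    closure = set(start_edges)
    delta = set(start_edges)
    while delta:
        derived = {(a, c) for a, b in delta for c in successors.get(b, ())}
        delta = derived - closure      # keep only pairs not seen before
        closure |= delta
    return closure

def hash_fragment(edges, workers):
    """Assign each edge to a fragment by hashing its source node."""
    fragments = [set() for _ in range(workers)]
    for src, dst in edges:
        fragments[hash(src) % workers].add((src, dst))
    return fragments

def fragmented_closure(edges, workers):
    """Each fragment derives the paths that start from its own edges."""
    successors = build_successors(edges)
    result = set()
    for fragment in hash_fragment(edges, workers):  # independent units of work
        result |= closure_from(fragment, successors)
    return result

EDGES = {(1, 2), (2, 3), (3, 4)}
SEQUENTIAL = closure_from(EDGES, build_successors(EDGES))
FRAGMENTED = fragmented_closure(EDGES, workers=2)
```

Because every path begins with exactly one base edge, the union of the per-fragment results equals the sequential closure. In this sketch each worker still needs read access to the whole base relation.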
Abstract: A multidatabase system (MDBS) is a system that integrates the
operational data of several autonomous database systems and provides a
uniform interface and control mechanisms to control access to those data. 
To efficiently retrieve and manipulate the data stored in MDBS, a metadata 
dictionary is needed as a repository of essential information for 
reasoning, controlling, and maintaining the retrieval/manipulation 
processes. In this paper we developed a two-level active metadata 
dictionary approach based on logic for building a metadata dictionary,
query processing, and maintenance in MDBS. The low-level metadata 
dictionaries (LLMDs) keep metadata for each corresponding local database 
in MDBS, respectively. The high-level metadata dictionary (HLMD) 
integrates the metadata about all LLMDs. The evaluation strategy is a
top-down approach that starts by treating a query as a global goal
to be achieved. The query is unified with rules successively to decompose
the goal into subgoals that can be evaluated against the extensional
database. These subgoals are then translated into corresponding queries
against the underlying DBMSs, respectively. The database integration strategy includes
two phases: schema translation and schema integration. It is a bottom-up 
approach integrating schema from the underlying database schemas. Updates
may cause inconsistencies in MDBS. We use incremental integrity constraint
checking to preserve consistency. The semantic query optimization 
evaluation can be partitioned into two phases: compilation phase and 
evaluation phase. During the compilation phase residues are computed and 
associated with deductive rules through a partial subsumption algorithm. In
the evaluation phase, redundant residues are eliminated and the result is
translated into a query against the underlying DBMS. (Author abstract) Refs.
DNA/protein sequence comparison, usually organized as a database
search, is a very powerful tool in modern molecular biology. In recent 
years, the rapid growth of sequence databases in their size as well as in 
their number poses demands for efficient programs to search these 
databases. In this thesis a distributed system capable of performing 
sequence searches on multiple biological databases simultaneously has 
been designed, implemented and tested. 

The two-phase nature of the FASTA algorithm makes it the algorithm of
choice to be modified for our distributed system. The system is built on a
three-tier architecture to support a flexible, expandable, and most
importantly, user transparent server network. The system is capable of 
searching multiple homogeneous and heterogeneous databases in a single 
query. Also, it can handle concurrent multiple client connections. 

In summary, the work accomplished in this thesis has demonstrated that 
the performance of sequence queries on multiple biological databases 
can be significantly improved if a distributed algorithm is used, compared 
to running uncoordinated parallel searches on these individual databases. 
It also shows that the usability of existing biological databases and 
database search programs can be greatly enhanced if multiple databases 
can be queried simultaneously, as one logical database, because users 
obtain the search results in one compiled report, which is not available if 
they run the searches separately on individual databases. Moreover, 
this thesis demonstrates that the Client/Server computing model used in 
biological database queries can greatly expand the possibilities to build a 
centralized biological data warehouse to facilitate multiple remote client 
requests through the Internet. 
Databases increasingly integrate different types of information such
as multimedia data. As a result, it is becoming necessary to support 
efficient storage and retrieval of multi-dimensional data. In several 
modern database applications, both the dimensionality and the amount of 
data that needs to be processed are increasing rapidly. Therefore, it is 
important to develop techniques that overcome the scalability and the 
dimensionality problems of multi-dimensional data sets. Since the amount of 
data is large, it is crucial to develop techniques that exploit parallelism 
in large-scale databases. In this context, we propose partitioning and 
declustering techniques for multi-disk architectures. Several effective 
solutions for the high dimensionality problem are also proposed: access 
structures for efficient searching, and dimensionality reduction
techniques to remove the curse of dimensionality. In particular, we propose 
a compression based index structure, a clustering based approximate search 
technique, and a dimensionality reduction technique using inner product 
approximations. Finally, we discuss two new types of queries and propose 
efficient techniques to process them. Extensive experimental evaluation of 
all presented techniques has been performed and comparison with other 
state-of-the-art approaches is presented. 
Similarity searches in sequence databases are important in many
application domains such as information retrieval, data mining, and 
clustering. Although sequential scanning can be used to perform similarity 
searches, it may require enormous processing time over large sequence 
databases. Recently, several indexing techniques have been proposed to 
speed up the processing of similarity searches. 

Most of the previous techniques use the Euclidean distance metric as a 
similarity measure. However, in many applications, the sampling rates and 
the lengths of sequences may be different, making it difficult or 
impossible to use the Euclidean distance metric. In the area of speech 
recognition, this problem has been approached using a similarity measure 
called the time warping distance, which allows sequences to be stretched or 
compressed along the time axis. 
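
As an illustration of the time warping distance described above, the following is a minimal sketch of the textbook dynamic-programming formulation. The function name and the use of the absolute difference as the per-element cost are assumptions made for this example; the abstract does not specify either.

```python
def time_warping_distance(a, b):
    """Time warping distance between two numeric sequences (textbook DP form).

    Assumption: the per-element cost is the absolute difference |a[i] - b[j]|.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] holds the cost of the best alignment of a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Either sequence may be stretched (repeat an element) or both advance.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

Because elements may be repeated during alignment, two sequences of different lengths can still be at distance zero, for example `[1, 2, 3]` and `[1, 2, 2, 3]`.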

In this dissertation, we investigate a set of indexing techniques for 
the fast retrieval of similar (sub)sequences of different lengths or 
different sampling rates. The goal of our approach is to achieve high 
search performance without missing any qualified answers. 

We first propose a whole sequence searching method, which extracts a 
time-warping invariant feature vector from each sequence and uses a 
lower-bound time warping distance function to compute the distance of any 
two feature vectors. The proposed method efficiently performs similarity 
search using a multi-dimensional index built on the set of feature 
vectors. 

We then propose a subsequence searching method, which uses a 
disk-based suffix tree as an index structure and employs lower-bound time 
warping distance functions to filter out dissimilar subsequences. To make 
the index structure compact and thus accelerate the query processing, the 
proposed method introduces the categorization and sparse indexing schemes. 

For a database with long data sequences, we propose a segment-based 
subsequence searching scheme which changes the similarity measure from time 
warping to piece-wise time warping in order to reduce the number of 
possible subsequences to be compared. For a database with multi-dimensional 
data sequences such as image sequences and video streams, we extend the 
proposed techniques by introducing the multi-dimensional time warping 
distance function. Finally, we apply the proposed subsequence searching 
techniques to the problem of discovering and matching sequential 
association rules. 
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A controlled medical terminology (CMT) is a collection of concepts (or 
terms) that are used in the medical domain. Typically, a CMT also contains 
attributes of those concepts and/or relationships between those concepts. 
Electronic CMTs are extremely useful and important for communication 
between and integration of independent information systems in healthcare, 
because data in this area is highly fragmented. A single query in this area 
might involve several databases, e.g., a clinical database, a pharmacy 
database, a radiology database, and a lab test database. 

Unfortunately, the extensive sizes of CMTs, often containing tens of 
thousands of concepts and hundreds of thousands of relationships between 
pairs of those concepts, impose steep learning curves for new users of such 
CMTs. In this dissertation, we address the problem of helping a user to 
orient himself in an existing large CMT. In order to help a user comprehend 
a large, complex CMT, we need to provide abstract views of the CMT. 
However, at this time, no tools exist for providing a user with such 
abstract views. One reason for the lack of tools is the absence of a good 
theory on how to partition an overwhelming CMT into manageable pieces. 

In this dissertation, we try to overcome the described problem by 
using a three-pronged approach. (1) We use the power of 
Object-Oriented Databases to design a schema extraction process for large, 
complex CMTs. The schema resulting from this process provides an excellent, 
compact representation of the CMT. (2) We develop a theory and a 
methodology for partitioning a large OODB schema, modeled as a graph, 
into small <italic>meaningful</italic> units. The methodology relies on the 
interaction between a human and a computer, making optimal use of the 
human's semantic knowledge and the computer's speed. Furthermore, the 
theory and methodology developed for the schema-level partitioning are 
also adapted to the object-level of a CMT. (3) We use purely <italic> 
structural similarities</italic> for partitioning CMTs, eliminating the 
need for a human expert in the partitioning methodology mentioned above. 

Two large medical terminologies are used as our test beds, the Medical 
Entities Dictionary (MED) and the Unified Medical Language System (UMLS), 
which itself contains a number of terminologies. 
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A data warehouse is a stand-alone repository of information consisting 
of “interesting” and “historic” data from several 
heterogeneous operational databases. The size of a data warehouse is 
very large and grows over time. Data warehouses are usually dedicated to 
the processing of queries issued by decision support systems (DSS). The 
response time of DSS queries is typically several orders of magnitude 
higher than the response time of OLTP (OnLine Transaction Processing) 
queries. Since DSS queries are often submitted interactively, techniques 
for reducing their response time are important. 

The caching of query results is one such technique particularly well 
suited to the DSS environment. In this thesis, we present an intelligent 
cache manager for such an environment. The cache manager can look up queries 
either based on an exact query match or using a <italic>query split 
</italic> algorithm to efficiently find query results which subsume the 
submitted query. The cache manager dynamically maintains the cache content 
by deciding whether a new query result should be admitted to the cache and 
if so, which query results should be evicted from the cache. The decisions 
are aimed at minimizing query response time. The decisions are based on the 
execution cost of each query, the size of each query result, the reference 
frequency to each result, the cost of maintenance of each result due to 
updates of the base tables, and the frequency of updates. Experimental 
evaluation shows that the manager can significantly improve performance 
when compared to similar systems. 
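
The admission and eviction decisions described above can be sketched as follows. The `profit` formula, the class name `CachedResult`, and the function `admit` are assumptions introduced for illustration; the abstract lists the factors the cache manager considers but does not give its actual metric.

```python
from dataclasses import dataclass

@dataclass
class CachedResult:
    name: str
    exec_cost: float    # cost of re-executing the query
    size: float         # space taken by the stored result
    ref_freq: float     # references per unit time
    maint_cost: float   # cost of refreshing the result after a base-table update
    update_freq: float  # base-table updates per unit time

    def profit(self):
        # Hypothetical metric: net time saved per unit of cache space.
        saved = self.ref_freq * self.exec_cost
        spent = self.update_freq * self.maint_cost
        return (saved - spent) / self.size

def admit(cache, candidate, capacity):
    """Return the new cache contents after trying to admit `candidate`.

    Lowest-profit results are evicted first; the candidate is rejected if
    making room would require evicting a result at least as profitable.
    """
    free = capacity - sum(r.size for r in cache)
    victims = []
    for r in sorted(cache, key=CachedResult.profit):
        if free >= candidate.size:
            break
        if r.profit() >= candidate.profit():
            return list(cache)  # not worth admitting
        victims.append(r)
        free += r.size
    if free < candidate.size:
        return list(cache)      # candidate does not fit even in an empty cache
    return [r for r in cache if r not in victims] + [candidate]
```

A real cache manager would also need the query-matching step (exact match or query split) before this decision is reached; that part is not sketched here.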

Since Web documents vary in their size, and the cost of their 
materialization depends upon the network delays, a profit based cache 
replacement algorithm can be applied to Web caching. At the same time, the 
cache must guarantee some form of consistency of the cached documents. 
Cache consistency algorithms enforce appropriate guarantees about the 
staleness of the cached documents. We have developed a unified cache 
maintenance algorithm which integrates both cache replacement and 
consistency algorithms. A trace-driven experimental study shows that the 
unified algorithm not only improves the average response time but also 
significantly reduces the number of stale documents returned to the clients. 
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Recent technological advances have made it possible to process and 
store large amounts of image/video data. Perhaps the most impressive 
example is the fast accumulation of image data in scientific applications 
such as medical and satellite imagery. The internet is another excellent 
example of a distributed database containing several millions of 
images. However, in order to realize their full potential, tools for 
automated analysis and extraction of information, and for intelligent 
searches in image databases need to be developed. 

We have investigated various techniques which facilitate content-based 
image search and retrieval. A prototype system, called NETRA, which enables 
the search of aerial photographs and natural color images has been 
implemented on the web using the platform independent Java language. A 
distinguishing aspect of this system is its incorporation of a robust 
automated image segmentation algorithm that allows object or region based 
search. Image segmentation significantly improves the quality of image 
retrieval when images contain multiple complex objects. Images are 
segmented into homogeneous regions at the time of ingest into the 
database, and image attributes that represent each of these regions are 
computed. This is the first time that image segmentation and region based 
search have been combined in a robust way and retrieval performance 
demonstrated on a large image database. 

In addition to image segmentation, other important components of the 
system include feature representations for characterizing the color, 
texture, and shape information, an approach to enhancing the retrieval 
performance by learning the appropriate similarity measures in the image 
feature space, and an image thesaurus model for image annotation and 
indexing. NETRA allows users to search by image example. For instance, the 
user can retrieve all images containing "blue sky" by specifying the color 
(blue) and location (upper one-third) information. Images containing snow 
covered peaks can be specified by selecting an example from the database 
and choosing color and texture attributes for search. NETRA can be accessed 
on the web at "http://vivaldi.ece.ucsb.edu/Netra." 
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This thesis presents the design and implementation of a database query 
processing engine that is optimized for access to tertiary memory devices. 
Tertiary memory devices provide a cost-effective solution for handling the 
on-going information explosion. While cheap and convenient, they pose new 
optimization challenges. Not only are tertiary devices three orders of 
magnitude slower than disks, but they also have a highly non-uniform access 
latency. Therefore, it is crucial to carefully reduce and reorder I/O on 
tertiary memory using effective query scheduling, batching, caching, 
prefetching and data placement techniques. 

We make two key modifications to an existing query processing 
architecture to support such aggressive optimizations: The first is a 
scheduler that uses system-wide information to make query scheduling, 
caching and device scheduling decisions in an integrated manner. The second 
is a reorderable executor that can process each query plan in the order in 
which data is made available by the scheduler rather than demand and 
process data in a fixed order, as in most conventional query execution 
engines. The two together provide unprecedented opportunities for 
optimizing accesses to tertiary memory. We have extended the POSTGRES 
database system with these optimizations. Measurements on the prototype 
yielded almost an order of magnitude improvement on the SEQUOIA-2000 
benchmark and on queries over synthetic datasets. 

We explore data placement techniques on tertiary memory devices to 
enable better clustering. This thesis concentrates on data placement issues 
for large multidimensional arrays — one of the largest contributors of data 
volume in many database systems. We discuss four techniques for doing 
this: (1) storing the array in multidimensional "chunks" to minimize the 
number of blocks fetched, (2) reordering the chunked array to minimize seek 
distance between accessed blocks, (3) maintaining redundant copies of the 
array, each organized for a different chunk size and ordering and (4) 
partitioning the array onto platters of a tertiary memory device so as to 
minimize the number of platter switches. Measurements on data obtained from 
global change scientists show that accesses on arrays organized using these 
techniques are often an order of magnitude faster than on the unoptimized 
data. 
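
To illustrate why technique (1) reduces the number of blocks fetched, the sketch below counts how many chunks a rectangular query touches for a given chunk shape. The function name and the example array and query are hypothetical; they are not taken from the thesis.

```python
def chunks_touched(query, chunk_shape):
    """Count the chunks overlapped by a hyper-rectangular query.

    query: list of (lo, hi) inclusive index ranges, one per dimension.
    chunk_shape: chunk extent along each dimension.
    """
    count = 1
    for (lo, hi), extent in zip(query, chunk_shape):
        # Number of chunk boundaries spanned along this dimension, plus one.
        count *= hi // extent - lo // extent + 1
    return count
```

For a 1000 x 1000 array stored one row per block (chunk shape `(1, 1000)`), a query over rows 0..999 and columns 0..9 touches 1000 blocks, whereas 32 x 32 chunks of similar size bring that down to 32.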
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Distributed database technology is expected to have a significant 
impact on data processing in the upcoming years because distributed 
database systems have many potential advantages over centralized systems 
for geographically distributed organizations. Data allocation and query 
optimization are two of the most important aspects of distributed database 
design. Data allocation involves placing a database and the applications 
that run against it in the multiple sites of a network. It is a very 
complex problem consisting of two processes: data fragmentation and 
fragment allocation. Data fragmentation involves the partitioning of each 
relation into a group of fragment relations while fragment allocation deals 
with the distribution of these fragmented relations across the sites of the 
distributed system. Query optimization includes designing algorithms 
that analyze and convert queries into a set of data manipulation 
operations. Both the data allocation and query optimization problems 
are NP-hard in nature and notoriously difficult to solve. We have attempted 
to combine the two highly interrelated and interactive decision processes 
in data allocation by formulating them as integer programs taking into 
consideration different constraints and under various assumptions. Various 
solution methods are discussed and a new linearization method is 
investigated. We next analyze the query optimization problem and reduce 
it to a join ordering problem. Several heuristics and a genetic algorithm 
have been developed for solving the join ordering problem. Some 
computational experiments on these algorithms were conducted and solution 
qualities compared. The computational experiments show that the suggested 
linearization method performs clearly and consistently better than a 
currently widely used method and that heuristics and genetic algorithms are 
viable methods for solving the query optimization problem. 
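
As a rough illustration of what a join ordering heuristic can look like, here is a simple greedy sketch. It is not one of the heuristics developed in the study, which the abstract does not describe; the cost model (cardinalities and pairwise selectivities) and all names are assumptions.

```python
def greedy_join_order(card, sel):
    """Greedy left-deep join order that keeps intermediate results small.

    card: {relation: cardinality}
    sel:  {frozenset({r, s}): join selectivity}; a missing pair is treated
          as a cross product (selectivity 1.0).
    """
    remaining = set(card)
    # Start from the smallest relation; ties are broken by name.
    first = min(remaining, key=lambda r: (card[r], r))
    order, size = [first], float(card[first])
    remaining.remove(first)
    while remaining:
        def result_size(r):
            # Estimated size after joining r with everything joined so far.
            s = size * card[r]
            for q in order:
                s *= sel.get(frozenset((q, r)), 1.0)
            return s
        nxt = min(remaining, key=lambda r: (result_size(r), r))
        size = result_size(nxt)
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

A greedy choice like this is fast but can miss the optimal order, which is one reason randomized methods such as genetic algorithms are studied for the same problem.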

It is anticipated that the models and solution methods developed in 
this study for data allocation and query optimization in distributed 
database systems may be of practical as well as theoretical use. 
Nevertheless, much more needs to be done to solve the distributed database 
design problems in order to achieve its potential benefits. Our models and 
solution methods can be the starting point for eventual resolution of these 
complex problems in large scale distributed database systems. 
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Much work has been accomplished in the past on the subject of parallel 
query processing and optimization in parallel relational database 
systems. However, little work on the same subject has been done in parallel 
object-oriented database systems. Since the object-oriented view of a 
database and its processing are quite different from those of a relational 
system, it can be expected that techniques of parallel query processing 
and optimization for the latter can be different from the former. In this 
dissertation, we present two parallel architectures, a general framework 
for parallel object-oriented database systems, several implemented 
query processing and optimization strategies together with some 
performance evaluation results. In this work, multi-wavefront algorithms 
are used in query processing to allow a higher degree of parallelism than 
the traditional tree-based query processing. Four optimization 
strategies, which are designed specifically for the multi-wavefront 
algorithms and for the optimization of single as well as multiple queries, 
are introduced and evaluated. A distributed result collection scheme which 
is designed to support retrieval queries is also introduced. Furthermore, 
two parallel architectures, namely, master-slave and peer-to-peer 
architectures are compared. A comparison is also made for two data 
placement strategies, namely, class-per-node vertical partitioning and 
hybrid partitioning. The query processing algorithms, four optimization 
strategies and the distributed result collection scheme have been 
implemented on a parallel computer nCUBE2, and the results of a performance 
evaluation are presented in this dissertation. The main emphases and the 
intended contributions of this dissertation are (1) data partitioning, 
parallel architecture, query processing, query optimization and 
result collection strategies suitable for parallel OODBMSs; (2) the 
implementation of these strategies; and (3) the performance evaluation 
results. 
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One of the primary functions of computers is to store information, 
i.e., to deal with long lived or persistent data. Programmers working with 
persistent data structures are faced with the problem that there are two, 
mostly incompatible, views of structured data, namely data in primary and 
secondary storage. Traditionally, these two views of data have been dealt 
with independently by researchers in the programming language and database 
communities. 

Significant research has occurred over the last decade on efficient 
and easy-to-use methods for manipulating persistent data structures in a 
fashion that makes the secondary storage transparent to the programmer. 
Merging primary and secondary storage in this manner produces a 
single-level store, which gives the illusion that data on secondary storage 
is accessible in the same way as data in primary storage. In complex design 
environments, a single-level store offers substantial performance 
advantages over conventional file or database access. These advantages are 
crucial to unconventional database applications such as computer-aided 
design, text management, and geographical information systems. In addition, 
a single-level store reduces complexity in a program by freeing the 
programmer from the responsibility of dealing with two views of data. 

This dissertation proposes, develops and investigates a novel approach 
for implementing single-level stores using memory mapping. Memory mapping 
is the use of virtual memory to map data stored on secondary storage into 
primary storage so that the data is directly accessible by the processor's 
instructions. In this environment, all transfer of data to and from the 
secondary store takes place implicitly during program execution. The 
methodology was motivated by the significant simplification in expressing 
complex data structures offered by the technique of memory mapping. This 
work parallels other proposals that exploit the potential of memory 
mapping, but develops a unique approach based on the ideas of segmentation 
and exact positioning of data in memory. Rigorous experimentation has been 
conducted to demonstrate the effectiveness and ease of use of the proposed 
methodology vis-a-vis the traditional approaches of manipulating structured 
data on secondary storage. 
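
The general idea of memory mapping can be shown with a small sketch that uses Python's standard `mmap` and `struct` modules. It only demonstrates implicit transfer between a file and memory; it does not reproduce the segmentation and exact-positioning approach of the dissertation. The record layout and the function name `demo` are assumptions.

```python
import mmap
import struct

# Hypothetical fixed-size record: a 4-byte int key and an 8-byte float value.
RECORD = struct.Struct("<i d")

def demo(path, n=4):
    # Reserve space for n records on secondary storage.
    with open(path, "wb") as f:
        f.write(b"\0" * (RECORD.size * n))
    # Map the file and update records as if they were ordinary memory.
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as m:
        for i in range(n):
            RECORD.pack_into(m, i * RECORD.size, i, i * 1.5)
        m.flush()  # push modified pages back to the file
    # Map it again read-only and read the records without explicit read() calls.
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        return [RECORD.unpack_from(m, i * RECORD.size) for i in range(n)]
```

Note that there is no explicit read or write of individual records: the operating system pages the data in and out as the mapped region is touched.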

The behaviour of high-level database algorithms in the proposed memory 
mapped highly parallel environment, especially in systems, has been 
investigated. A quantitative analytical model of computation in this 
environment has been designed and validated through experiments conducted 
on several database join algorithms; parallel multi-disk versions of 
the traditional join algorithms were developed for this purpose. An 
analytical model of the system is extremely useful for data structure and 
algorithm designers for predicting general performance behaviour without 
having to construct and test specific algorithms. More importantly, a 
quantitative model is an essential tool for database subsystems such as a 
query optimizer. 
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In the relational data model a relation is a set of tuples; therefore 
the same tuple cannot exist more than once in a relation. However, in 
practice the need arises for relations with duplicates, called 
multirelations. Many database systems, in coping with duplicates, are 
inconsistent and ill-defined. The first part of this thesis provides a 
theoretic and practical framework for integrating multirelations into 
relational databases. 

We argue that a multirelation contains semantically incomplete 
data, being a vertical section of the complete relation, a relation 
without duplicates. The multirelation constitutes the output columns of the 
complete relation and the rest are called hidden columns. The multirelation 
partially describes the complete relation entities and is meaningful only 
in the context of the complete relation. 
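
The relationship between a complete relation and its multirelation can be illustrated with a short sketch. The data and the function name are hypothetical, and rows are modelled as dictionaries purely for readability.

```python
from collections import Counter

def multirelation(complete, output_cols):
    """Project a duplicate-free relation onto its output columns.

    Duplicates are kept and counted; the columns left out play the role of
    the hidden columns.
    """
    return Counter(tuple(row[c] for c in output_cols) for row in complete)

# Complete relation: every row is distinct because of the "id" column.
employees = [
    {"id": 1, "dept": "R&D"},
    {"id": 2, "dept": "R&D"},
    {"id": 3, "dept": "Sales"},
]
```

Here `multirelation(employees, ["dept"])` contains the tuple `("R&D",)` twice; the duplicate is meaningful only because two distinct employees stand behind it in the complete relation.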

Accordingly, base relations or views should not be extended to 
include multirelations. However, a multirelation serves naturally as query 
output, where often partial information is desired. We define the notion of 
full multirelational expressiveness as any meaningful query with 
multirelational output (a multirelational query). Such a query specifies a 
complete relation and designates its hidden and output columns. We show how 
any relational query language can be extended to achieve full 
multirelational query expressiveness, and we present a description of its 
extension to the query language QUEL. 

We also show how to use tableau techniques to check equivalence 
among conjunctive multirelational queries and how to minimize such queries. 
In the presence of functional dependencies further query simplification 
is possible using the chase process. A new conversion chase rule is 
introduced which removes hidden columns from the complete relation of the 
query and thus simplifies it. 

The second part of this thesis investigates database 
fd-acyclicity. Acyclic schemes allow evaluation of join-project queries 
using semijoin instead of join operations. In the presence of functional 
dependencies some cyclic schemes acquire this property, and we address 
recognizing these schemes. 

We present and prove an fd-acyclicity decision algorithm for an 
important class of cyclic schemes called Acliques and an arbitrary set of 
functional dependencies. We also suggest a decision algorithm for general 
database schemes, based on the construction of the cycle space database 
instance. (Abstract shortened with permission of author.) 
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A processor functioning as a coprocessor attached to a central processing 
complex provides efficient execution of the functions required for database 
processing: sorting, merging, joining, searching and manipulating fields in 
a host memory system. The specialized functional units: a memory interface 
and field extractor/assembler, a Predicate Evaluator, a combined 
sort/merge/join unit, a hasher, and a microcode control processor, are all 
centered around a partitioned Working Store. Each functional unit is 
pipelined and optimized according to the function it performs, and executes 
its portion of the query efficiently. All functional units execute 
simultaneously under the control processor to achieve the desired results. 



Many different database functions can be performed by chaining simple 
operations together. The processor can effectively replace the CPU bound 
portions of complex database operations with functions that run at the 
maximum memory access rate improving performance on complex queries. 

Descriptors: Computers; Database management systems 

Classification Codes and Description: 5.02 (Computer Systems General); 6.02 
(Bibliographic Search Services, Databases) 
Main Heading: Information Processing and Control; Information Systems and 
Applications 
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A database server with a "shared nothing" system architecture has 
multiple nodes, each having its own central processing unit, primary and 
secondary memory for storing database tables and other data structures, and 
communication channels for communication with other ones of the nodes. The 
nodes are divided into first and second groups that share no resources. 
Each database table in the system is divided into fragments distributed 
for storage purposes over all the nodes in the system. To ensure continued 
data availability after a node failure, a "primary replica" and a "standby 
replica" of each fragment are each stored on nodes in different ones of the 
first and second groups. Database transactions are performed using the 
primary fragment replicas, and the standby replicas are updated using 
transaction log records. Every node of the system includes a data 
dictionary that stores information indicating where each primary and 
standby fragment replica is stored. 
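
The replica placement rule described above can be sketched as follows. The round-robin assignment and the alternation of the primary between groups are assumptions chosen for the example; the abstract only requires that the two replicas of a fragment sit in different groups and that fragments are spread over all nodes.

```python
def place_fragments(num_fragments, group_a, group_b):
    """Build a data dictionary {fragment: (primary_node, standby_node)}.

    The primary and standby replicas of a fragment are always placed on
    nodes from different groups, so losing one group leaves a full copy.
    """
    placement = {}
    for frag in range(num_fragments):
        i = frag // 2
        a = group_a[i % len(group_a)]
        b = group_b[i % len(group_b)]
        # Alternate which group holds the primary to spread transaction load.
        placement[frag] = (a, b) if frag % 2 == 0 else (b, a)
    return placement
```

In the described system every node would hold a copy of this dictionary so that any node can locate both replicas of a fragment.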

Descriptors: Client server systems; Computer architectures; Database 
management systems; Multiprocessing 
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Processing); 6.02 (Bibliographic Search Services, Databases) 
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A method and apparatus for compressing dictionary database information 
is described. The method divides the database information into a number 
of parts which are each conducive to a predetermined compression technique. 
A first part database is formed consisting of all the entry points in the 
dictionary wherein each entry point is associated with a unique word 
number. A second part database is formed consisting of a multiplicity of 
placeholders. A third part database is formed consisting of all the entry 
points of the dictionary in the exact order in which they appear in the 
dictionary. A fourth part database is formed consisting of the definitions 
and usage notes without reference to their text. A fifth part database 
allows retrieval of articles of interest without having to decompress the 
entire dictionary. Compression techniques using multigrams and 
minimum-redundancy codes are selectively applied to the different database 
parts. 
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A processor functioning as a coprocessor attached to a central processing 
complex provides efficient execution of the functions required for database 
processing: sorting, merging, joining, searching and manipulating fields in 
a host memory system. The specialized functional units: a memory interface 
and field extractor/assembler, a Predicate Evaluator, a combined 
sort/merge/join unit, a hasher, and a microcoded control processor, are all 
centered around a partitioned Working Store. Each functional unit is 
pipelined and optimized according to the function it performs, and executes 
its portion of the query efficiently. All functional units execute 
simultaneously under the control processor to achieve the desired results. 
Many different database functions can be performed by chaining simple 
operations together. The processor can effectively replace the CPU bound 
portions of complex database operations with functions that run at the 
maximum memory access rate improving performance on complex queries. 

Descriptors: Access; Array processors; Databases; Host computers. 
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This paper reviews the services provided by the ERIC system and suggests 
ways to broaden its scope through coordination with other existing 
education databases. The first of four major sections chronicles the 
evolution of ERIC and explains why it has not been able to exercise true 
bibliographic control over the literature of education. The second section 
identifies and discusses four stages at which efforts could be made to 
coordinate the various education information databases: (1) coverage, 
acquisition, and selection; (2) processing; (3) finding the right database 
to search; and (4) retrieval. Examples of possible applications are 
included. The third section introduces and discusses the concept of a 
"federation" of education databases which would ensure that all domestic 
educational resources would be available to users through clearly 
delineated channels. It is suggested that ERIC could serve as a focus of 
such an organization. 
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Databases and clearinghouses useful in education are described in this 
publication. Section one, "databases," contains one-page summaries of 54 
databases of interest to educators, covering a variety of subjects, such as 
energy and environmental education, psychology, funding sources, language, 
special education, art, child abuse and neglect, and research on early 
childhood and adolescent development. Section two, "clearinghouses," 
consists of one-page summaries of 30 clearinghouses as well as a list of 
the 16 clearinghouses and network components of ERIC. Subject areas covered 
include consumer education, women's equity, adult education, test 
collection, community education, drug abuse, and nutrition education. Each 
database and clearinghouse summary includes such information as the 
acronym; name of database; major subject area(s); date established; 
publication/print journals; thesaurus/search aids; types of source 
documents; forms of retrievable information; and information contact. A 
sample computer search, using one of seven questions, is provided for many 
of the databases. 
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Abstract: Nowadays feature vector based similarity search is increasingly 
emerging in database systems. Consequently, many multidimensional data 
index techniques have been widely introduced to the database research 
community. These index techniques are categorized into two main classes: SP 
(space partitioning)/KD-tree-based and DP (data partitioning)/R-tree-based. 
Recently, a hybrid index structure has been proposed. It 
combines both SP/KD-tree-based and DP/R-tree-based techniques to form a 
new, more efficient index structure. However, weaknesses still exist 
in the techniques above. In this paper, we introduce a novel and flexible index 
structure for multidimensional data, the SH-tree (Super Hybrid tree). 
Theoretical analyses show that the SH-tree is a good combination of both 
techniques with respect to both presentation and search algorithms. It 
overcomes the shortcomings and makes use of their positive aspects to 
facilitate efficient similarity searches. (36 Refs) 
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Abstract: SchemaSQL is a recently proposed extension to SQL for enabling 
multi-database interoperability. Several recently identified 
applications for SchemaSQL, however, mainly rely on its ability to treat 
data and schema labels in a uniform manner, and call for an efficient 
implementation of it on a single RDBMS. We first develop a logical algebra 
for SchemaSQL by combining classical relational algebra with four 
restructuring operators (unfold, fold, split, and unite) originally 
introduced in the context of the tabular data model by Gyssens et al. 
(1996), and suitably adapted to fit the needs of SchemaSQL. We give an 
algorithm for translating SchemaSQL queries/views involving restructuring 
into the logical algebra above. We also provide physical algebraic 
operators which are useful for query optimization. Using the various 
operators as a vehicle, we give several alternate implementation strategies 
for SchemaSQL queries/views. All the proposed strategies can be 
implemented non-intrusively on top of existing relational DBMS, in that 
they do not require any additions to the existing set of plan operators. We 
conducted a series of performance experiments based on TPC-D benchmark 
data, using the IBM DB2 DBMS running on Windows NT. In addition to showing 
the relative tradeoffs between various alternate strategies, our 
experiments show the feasibility of implementing SchemaSQL on top of 
traditional RDBMS in a non-intrusive manner. Furthermore, they also suggest 
new plan operators which might profitably be added to the existing set 
available to relational query optimizers, to further boost their 
performance. (32 Refs) 
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Abstract: One of the common queries in many database applications is 
finding approximate matches to a given query item from a collection of data 
items. For example, given an image database, one may want to retrieve all 
images that are similar to a given query image. Distance-based index 
structures are proposed for applications where the distance computations 
between objects of the data domain are expensive (such as high-dimensional 
data) and the distance function is metric. In this paper we consider using 
distance-based index structures for similarity queries on large metric 
spaces. We elaborate on the approach that uses reference points (vantage 
points) to partition the data space into spherical shell-like regions in 
a hierarchical manner. We introduce the multivantage point tree structure 
(mvp-tree) that uses more than one vantage point to partition the space 
into spherical cuts at each level. In answering similarity-based queries, 
the mvp-tree also utilizes the precomputed (at construction time) distances 
between the data points and the vantage points. We summarize the 
experiments comparing mvp-trees to vp-trees that have a similar 
partitioning strategy but use only one vantage point at each level and 
do not make use of the precomputed distances. Empirical studies show that 
the mvp-tree outperforms the vp-tree by 20% to 80% for varying query ranges 
and different distance distributions. Next, we generalize the idea of using 
multiple vantage points and discuss the results of experiments we have made 
to see how varying the number of vantage points in a node affects search 
performance and how much is gained in performance by making use of 
precomputed distances. The results show that, after all, it may be best to 
use a large number of vantage points in an internal node in order to end up 
with a single directory node and keep as many of the precomputed distances 
as possible to provide more efficient filtering during search 
operations. Finally, we provide some experimental results that compare 
mvp-trees with M-trees, a dynamic distance-based index structure 
for metric domains. (24 Refs) 
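The filtering role of precomputed distances can be sketched as follows. This is not the mvp-tree itself: it is a flat table of distances to a few vantage points, which only illustrates the triangle-inequality test that lets a range query skip expensive distance computations. The function names and the choice of Euclidean distance are assumptions for the example.

```python
import math

def dist(a, b):
    # Stand-in for an expensive metric distance function.
    return math.dist(a, b)

def build_table(points, vantage_points):
    """Precompute the distance from every data point to every vantage point."""
    return [[dist(p, v) for v in vantage_points] for p in points]

def range_query(points, vantage_points, table, q, r):
    """Return (points within r of q, number of point distances actually computed)."""
    dq = [dist(q, v) for v in vantage_points]
    hits, computed = [], 0
    for p, dp in zip(points, table):
        # Triangle inequality: |d(q,v) - d(p,v)| <= d(q,p).
        # If the left side exceeds r for any vantage point, p cannot qualify.
        if any(abs(a - b) > r for a, b in zip(dq, dp)):
            continue
        computed += 1
        if dist(q, p) <= r:
            hits.append(p)
    return hits, computed

points = [(x, y) for x in range(10) for y in range(10)]
vantage_points = [(0, 0), (9, 0)]
table = build_table(points, vantage_points)
hits, computed = range_query(points, vantage_points, table, (4.2, 5.1), 1.5)
```

Every point that survives the filter is still verified with a real distance computation, so the filter can only save work; it never changes the answer.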
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Abstract: The issue of interoperability among multiple autonomous 
databases has attracted a lot of attention from researchers. The past 
research on this issue can be roughly divided into two main categories: 
the tightly-integrated approach that integrates databases by building an 
integrated schema; and the loosely-integrated approach that achieves 
interoperability by using a multidatabase language. Past efforts focus on 
the issues in the first approach. The problem with the first approach is 
that it lacks a convenient representation of the integrated schema at the 
system level and a sound mathematical basis for data manipulation in a 
multidatabase system. We propose to use hyperrelations as a powerful and 
succinct model for the global level representation of heterogeneous 
database schemas. A hyperrelation has the structure of a relation, but its 
contents are the schemas of the semantically equivalent local relations in 
the databases. With this representation, the metadata of the global 
database, local databases and the data of these databases are all 
representable by using the structure of a relation. The impact of such a 
representation is that all the elegant features of relational systems can 
be easily extended to multidatabase systems. A hyperrelational algebra is 
designed accordingly. This algebra is performed at the multidatabase 
system (MDBS) level such that query transformation and optimization are 
supported on a sound mathematical basis. (52 Refs) 
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Abstract: To an increasing extent, applications demand the capability of 
retrieval based on image content. As a result, large image database systems 
need to be built to support effective and efficient accesses to image data 
on the basis of content. In this process, significant features must first 
be extracted from image data in their pixel format. These features must 
then be classified and indexed to assist efficient retrieval of image 
content. However, the issues central to automatic extraction and indexing 
of image content largely remain an open problem. Tools are not currently 
available with which to accurately specify image content for image database 
uses. In this paper, we investigate effective block-oriented image 
decomposition structures to be used as the representation of images in 
image database systems. Three types of block-oriented image decomposition 
structures, namely, quad-, quin- and nona-trees, are compared. In analyzing 
and comparing these structures, wavelet transforms are used to extract 
image content features. Our experimental analysis illustrates that 
nona-tree decomposition is the most effective of the three decomposition 
structures available to facilitate effective content-based image retrieval. 
Using the nona-tree structure to represent image content in an image 
database, various types of content-based queries and efficient 
image retrieval can be supported through novel indexing and searching 
approaches. We demonstrate that the nona-tree structure provides a highly 
effective approach to supporting automatic organization of images in large 
image database systems. (28 Refs) 
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Abstract: An approach to query processing in object oriented distributed 
database systems is proposed. Distributed PIOS is a server that supports an 
object-oriented data model, physical data independence (i.e. different 
strategies for storing class hierarchies, grouping, horizontal and vertical 
partitioning of objects), and fragmentation transparency (i.e. 
transactions are not aware of the distribution of database fragments on 
several nodes of a computer network). The problem of the optimization 
of distributed queries (i.e. determining which data must be accessed at 
which site and which data must be transmitted among sites) is the focus of 
the paper. (5 Refs) 
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Abstract: Query processing is a very important issue in distributed 
databases. Many algorithms have been proposed to process distributed 
queries efficiently. However, most of the algorithms use oversimplified 
cost models and ignore the impact of work load generated by other 
applications. As a result, load balancing is difficult to achieve in a real 
environment. We provide an adaptive scheme to do load balancing 
effectively. The scheme takes into account an environment in which the load 
at different sites varies. The partition and replicate strategy algorithm 
is used to explain how to achieve load balancing in a multi-user 
environment. The scheme also has learning capability such that the 
parameters of cost estimation functions can be adaptively adjusted as the 
environment changes. (20 Refs) 
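The abstract does not give the cost functions or the learning rule, so the following is only a hypothetical sketch of the general idea: each site has a cost parameter that is re-estimated from observed response times, and work is routed to the site with the lowest current estimate. The class name, the linear cost model, and the exponential smoothing update are all assumptions for illustration.

```python
class SiteCostModel:
    """Hypothetical per-site cost estimator with adaptive parameters."""

    def __init__(self, sites, alpha=0.5):
        self.alpha = alpha  # smoothing weight given to new observations (assumed)
        self.sec_per_unit = {s: 1.0 for s in sites}  # initial guess per site

    def estimate(self, site, work_units):
        # Assumed linear cost model: seconds = parameter * amount of work.
        return self.sec_per_unit[site] * work_units

    def choose_site(self, work_units):
        # Route to the site with the lowest estimated cost right now.
        return min(self.sec_per_unit, key=lambda s: self.estimate(s, work_units))

    def observe(self, site, work_units, elapsed_seconds):
        # Blend the observed rate into the parameter so it tracks load changes.
        observed = elapsed_seconds / work_units
        old = self.sec_per_unit[site]
        self.sec_per_unit[site] = (1 - self.alpha) * old + self.alpha * observed

model = SiteCostModel(["A", "B"])
model.observe("A", 10, 30.0)        # site A turned out to be slow
first_choice = model.choose_site(10)
model.observe("B", 10, 50.0)        # now site B is even slower
second_choice = model.choose_site(10)
```

A smaller `alpha` would make the estimates more stable but slower to react when the load at a site shifts.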
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Abstract: Signature files have been studied extensively, as an access 
method for textual databases. Many approaches have been proposed for 
searching signature files efficiently. However, different methods make 
different assumptions and use different performance measures, making it 
difficult to compare their performance. In this paper, we study three basic 
methods proposed in the literature, namely, the indexed descriptor file, 
the two-level superimposed coding scheme, and the partitioned signature 
file approach. The contribution of this paper is two-fold. First, we 
present a uniform analytical performance model so that the methods can be 
compared fairly and consistently. The analysis shows that the two-level 
superimposed coding scheme, if stored in a transposed file, has the best 
performance. Second, we extend the two-level superimposed coding method 
into a multilevel superimposed coding method. We obtain the optimal number 
of levels for the multilevel method and show that for databases with 
reasonable size the optimal value is much larger than 2, which is assumed 
in the two-level method. The accuracy of the analytical formula is 
demonstrated by simulation. (21 Refs) 
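For readers unfamiliar with superimposed coding, a minimal single-level sketch is given below. It is not any of the three methods analyzed in the paper. The signature width `F`, the number of bits per word `K`, and the use of SHA-256 as the hash are assumptions chosen for the example.

```python
import hashlib

F = 64  # signature width in bits (assumed)
K = 3   # bit positions set per word (assumed)

def word_signature(word):
    """Hash a word to up to K bit positions in an F-bit signature."""
    sig = 0
    for i in range(K):
        digest = hashlib.sha256(f"{i}:{word}".encode()).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % F)
    return sig

def block_signature(words):
    """Superimpose (bitwise OR) the signatures of all words in a text block."""
    sig = 0
    for w in words:
        sig |= word_signature(w)
    return sig

def may_contain(block_sig, word):
    """True if the block may contain the word; false drops are possible."""
    q = word_signature(word)
    return (block_sig & q) == q

def search(blocks, word):
    """Filter by signature, then verify against the text to remove false drops."""
    sigs = [block_signature(b) for b in blocks]
    return [i for i, (b, s) in enumerate(zip(blocks, sigs))
            if may_contain(s, word) and word in b]

blocks = [["query", "optimizer", "index"], ["signature", "file", "index"]]
found = search(blocks, "index")
```

The signature test can report a block that does not contain the word, but it never misses a block that does, which is why the verification step only has to check the surviving candidates.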
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Abstract: Numerous structures for database query optimizers have 
been proposed. Many of those proposals aimed at automating the construction 
of query optimizers from some kind of specification of optimizer 
behavior. These specification frameworks do a good job of partitioning 
and modularizing the kinds of information needed to generate a query 
optimizer. Most of them represent at least part of this information in a 
rule-like form. Nevertheless, large portions of these specifications still 
take a procedural form. The contributions of this work are threefold. We 
present a language for specifying optimizers that captures a larger portion 
of the necessary information in a declarative manner. This language is in 
turn based on a model of query rewriting where query expressions carry 
annotations that are propagated during query transformation and planning. 
This framework is reminiscent of inherited and synthesized attributes for 
attribute grammars, and we believe it is expressive of a wide range of 
information: logical and physical properties, both desired and delivered, 
cost estimates, optimization contexts, and control strategies. Finally, we 
present a mechanism for processing optimizer specifications that is based 
on compile-time reflection. This mechanism proves to be succinct and 
flexible, allowing modifications of the specification syntax, incorporation 
of new capabilities into generated optimizers, and retargeting the 
translation to a variety of optimization frameworks. We report on an 
implementation of our ideas using the CRML reflective functional language 
and on optimizer specifications we have written for several query algebras. 
(13 Refs) 
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Abstract: Dialog automated information systems with access to both local 
data bases and remote access to information resources of integrated centers 
are an efficient form of information service. Despite the broad 
capabilities in accumulating and processing information files and data 
bases of various structures, dialog systems have run into conflicts and 
problems, especially the contradiction between search in the dialog mode 
and manual indexing of requests in thesauri and rubricators; the 
full-fledged syntactic tools in dialog information languages and the 
absence of semantic procedures to reveal the relationships between the 
search terms; and the broad capabilities of dialog information systems in 
organizing retrieval files by selecting natural language segments from 
the documents, on the one hand, and the absence of any practical strategies 
for search in such files, on the other. The authors describe an 
experimental system of user adjustment to the vocabulary environment of a 
dialog information system, with preliminary semantic and statistical 
searches. A detailed study of the interaction between the user and the 
thesaurus-free vocabulary of the data base is based on a probabilistic 
model of statistical search. (14 Refs) 
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ABSTRACT: This paper proposes a new method of a VQ (Vector 

Quantization)-based preprocessor for reducing the amount of computation 
in large vocabulary isolated word recognition. A speech wave is 
analyzed by time functions of both cepstrum coefficients and their 
short-time regression coefficients, and a universal VQ codebook for 
these time functions is constructed based on a multiple-speaker, 
multiple-word database. Next, a separate codebook is designed as a 
subset of the universal codebook for each word in the vocabulary. 
These word-based codebooks are used as a front-end preprocessor to 
eliminate word candidates whose distance scores are large. A dynamic 
time-warping processor based on a word dictionary in which each word 
is represented as a time-sequence of the universal codebook elements 
(SPLIT method) then resolves the choice among the remaining word 
candidates. Effectiveness of this method has been ascertained by 
recognition experiments using a database consisting of words from a 
vocabulary of 100 Japanese city names uttered by 20 male 
speakers. (author abst.) 
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Detailed Description 

Claims 

Fulltext Word Count: 150517 
English Abstract 

A system, method, and article of manufacture are described for providing 
a self-describing stream-based communication system. Messages are sent 
which include data between a sending system and a receiving system. 
Meta-data is attached to the messages being sent between the sending 
system and the receiving system. The data of the messages sent from the 
sending system to the receiving system is translated based on the 
meta-data. The meta-data includes first and second sections. The first 
section identifies a type of object associated with the data and a number 
of attribute descriptors in the data. The second section includes a 
series of the attribute descriptors defining elements of the data. 

French Abstract 

L'invention concerne un systeme, un procede et un article de fabrication 
destines a constituer un systeme de communication a base d'un flux 
d'autodescripteurs. Des messages comprenant des donnees sont envoyes 
entre un systeme expediteur et un systeme recepteur. Des metadonnees sont 
attachees aux messages en cours d'envoi entre le systeme expediteur et le 
systeme recepteur. Les donnees des messages envoyes du systeme expediteur 
au systeme recepteur sont traduites d'apres les metadonnees, lesquelles 
comprennent des premiere et seconde sections. La premiere section 
identifie un type d'objet associe aux donnees et un nombre de 
descripteurs d'attributs presents dans celles-ci. La seconde section 
comprend une serie de descripteurs d'attributs definissant des elements 
des donnees. 

Legal Status (Type, Date, Text) 

Publication 20010308 A2 Without international search report and to be 
republished upon receipt of that report. 
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19th month from priority date 
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Fulltext Availability: 
Detailed Description 

Detailed Description 

. . . of these components will require special attention 
because of the functional demands of the building?" 
Oxford English Dictionary Definition. 

The conceptual structure and overall logical organization of a computer 
or computer-based system from the point... Transaction Management Services 
coordinate transactions across one or more resource managers either on a 
single machine or multiple machines within the network. Transaction 
Management Services ensure that all resources for a transaction are 
updated, or . . . 

...of an update failure on any one resource, all updates are rolled back. 

This services that allow multiple applications to share data with 
integrity. The transaction management services help implement the notion 
of a transaction. a printer name is passed. For status update, the new 
status code is passed. 

Request Report. The Request Report function is responsible for 
processing report request messages written to the report process queue. 
It creates a new... single report to single or multiple destinations. 

16. Destination Rationalization: For some systems, it is possible that 
multiple copies of a report will be sent to the same site: to several 
different users, for example. In these cases, it is highly desirable to 
have the report architecture recognize ... system); and (2) Message-based 
architecture (relying on specific mail systems for much of the 
functionality) versus Database-based. 

What is the nature of the workflow? 

How an organization approaches the management of its... of Capability 
Release Design and into Capability Release Build and Test 3610, Business 
Components are transformed into Partitioned Business Components based 
on the realities of the technical environment. These constraints include 
distribution requirements, legacy integration integrity of the Business 
Component model, a given Partitioned Business Component should descend 
from one and only one Business Component. 

In other words, it should never... 

...the Business Component level. Also at this time, the project team 
designs the internal workings of each Partitioned Business Component. 
This could mean the Engineering Components that make up the Partitioned 
Business Component, the "wrapper" for a legacy or packaged system, and 
other code. 

In Capability Release Build and Test, Partitioned Business Components 
are built and tested. The build process varies depending upon the 
technology chosen to build the internal workings of each Partitioned 
Business Component. Among the many tests that are performed during this 
stage, the component, assembly, and performance tests are impacted the 
most by this style of development. A component test addresses a 
Partitioned Business Component as a single unit by testing its 
interfaces and its internal workings, while an assembly test addresses 
the interactions between Partitioned Business Components by testing 
broader scenarios. The performance test is impacted primarily by the 
techniques one would... 

...to resolve the various performance issues. 

For example, it's common to run multiple copies of a Partitioned 
Business Component across multiple servers to handle a greater 
transaction volume. 

In Deployment 3612, the Partitioned Business Components are packaged 
and deployed as part of the application into the production environment. 
The application parameters and the manner in which the Partitioned 
Business Components are distributed are tweaked based on how well the 
application performs. 

Well designed Business Components ... would be logical to conclude that the 
two types of Business Components translate to two types of Partitioned 
Business Components, but a small adjustment is required. Entity-centric 
Business Components translate directly to Business Entity... 

...Component. 

Figure 38 illustrates the relationship between the spectrum of Business 
Components 3800 and the types of Partitioned Business Components 3802. 
Business Entity Components 3804 and Business Process Components 3806 are 
straightforward. The former is... 

...Figure 40 is a diagram of the Eagle Application Model which illustrates 
how the different types of Partitioned Business Components might 
interact with each other. Business Entity Components 4002 and Business 
Process Components 4004 typically... 

...while User Interface Components 4006 typically reside on a client. 

Figure 41 illustrates what makes up a Partitioned Business Component 
4100. As long as a component does what it's supposed to do, it doesn... 

...benefit of encapsulation. Classifying this code is a different matter. 
Some code 4102 is specific to the Partitioned Business Component. Other 
code is more widely reusable, both functionally and technically; this is 
where one finds... 

...for designing and building this code. 

Engineering Components are physical building blocks used in the assembly 
of Partitioned Business Components. They are independent pieces of 
software that provide functionality that is generally useful across a... 
increasing speed to market and the ability to 
cope with change (0.7 probability). 

Business Components and Partitioned Business Components represent a 
major improvement in design capability; some might argue the first major 
change in... 

...for this breakthrough. 

Business Components model entities and processes at the enterprise level 
and they evolve into Partitioned Business Components that are 
integrated into applications that operate over a network. Consequently, 
they serve as an... 

...to the application's overall maintainability. 

To manage the complexity of a large problem, it must be divided into 
smaller, coherent parts. 

Partitioned Business Components provide an excellent way to divide 
and conquer in a way that ties the application to the business domain. 

They provide the ability. 

...reusable in multiple contexts. On the other end of the spectrum, objects 
are too small to effectively divide and conquer; there are simply too 
many of them. 

Partitioned Business Components provide a greater emphasis on 
application layering, a well-known but often neglected concept in 
application development. 

Partitioned Business Components are application building blocks. As an 
application modeling 
tool, they depict how various elements of... focus appropriately on the 
high-level reuse enabled by processes, patterns, and frameworks. 

Although Business Components and Partitioned Business Components 
represent a significant breakthrough in design capability, the 
architectural frameworks to support this breakthrough are... 

...Business Components are the same as applications, but in fact, 
applications are assembled from Business Components (or Partitioned 
Business Components to be more accurate). A typical application might 
have ten to twenty Business Components. On... 5 and other requirements. 
Look for behaviors that will be supported by the application. In other 
This section addresses several frequently asked questions that more 
broadly apply to the physical implementation of component- and object... 
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English Abstract 

A universal epistemological machine (U.M.) enables arbitrary synthetic 
forms of existence (that is, thinking machines) known as androids, which 
know and perceive the world as do human beings. The U.M. embodies 
transformations of an extended existential universe of human being, and 
comprises means for transforming, representing, embodying, translating 
and realizing a plurality of universal forms. These universal forms 
comprise universal objects in the form of physical embodiments of 
universal knowledge structures. The U.M. comprises a plurality of 
epistemic instances comprising the universal objects and universal 
transformations of those universal objects, expressed in a universal 
grammar, which allows all human knowledge to be enabling media for the 
U.M. 

French Abstract 

Une machine epistemologique universelle (M.U.) permet de creer des 
formes de vie synthetiques arbitraires (c'est-a-dire des machines 
pensantes) connues sous le nom d'androides qui connaissent et percoivent 
le monde comme le font les etres humains. La M.U. integre des 
transformations d'un univers existentiel etendu d'etres humains et 
comprend des moyens permettant de transformer, representer, integrer, 
traduire et realiser une pluralite de formes universelles. Ces formes 
universelles comprennent des objets universels se presentant sous forme 
de representations physiques de structures de connaissances universelles. 
La M.U. comprend une pluralite d'instances epistemiques comprenant ces 
objets universels et les transformations universelles de ces objets 
universels, exprimees dans une grammaire universelle qui permet a toute 
la connaissance humaine d'etre un support d'integration pour la M.U. 

Fulltext Availability: 
Claims 

Claim 

... sentence. Since our thoughts transform in accordance with epistemic 
instance, we construct sentences epistemically, not in subject-predicate 
structure. In order to construct an English sentence 
naturally, one must ignore the grammar of 
English-the... 
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ABSTRACT 

PURPOSE: To contrive efficient proofreading by using a concordant adverb 
dictionary storing sets of adverbs calling for the concordance with the 
expression of the predicate and the classification of significance of 
the expression of the predicate requested by the adverb so as to check 
the usage of the word in concordance with each other. 

CONSTITUTION: A concordant adverb stored in a concordant adverb dictionary 
2 is retrieved from a sentence in a sentence storage means 1 by a concordant 
adverb retrieval means 3, the result is outputted to a predicate 
retrieval means 5, the classification of significance of the expression 
of the predicate called for by the adverb is extracted from the 
dictionary 2 and given to a significance comparison means 7. The means 5 
retrieves the predicate in concordance with the adverb from the sentence 
in the storage means 1, outputs the expression of the detected predicate to 
a predicate expression deciding means 6 and outputs the location of the 
adverb and predicate to an error suggestion means 8. The means 6 references 
the predicate expression dictionary 4 as to the expression of the sent 
predicate to decide the classification of significance and outputs the 
result to the means 7. The means 7 compares the two inputted 
classifications of significance, and when they disagree, the means 8 
suggests the adverb and predicate as an error. 
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Multiagent type integrated database system for query processing, refers * 
information about predicates for converting query which is thrown into 
integrated database system, into several query sets 
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(SHIN-I); USHIJIMA K (USHI-I) 

Inventor: NISHIZAWA I; SHINTANI T; USHIJIMA K 
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US 20020120618 Al 20 G06F-007/00 
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Abstract (Basic) : US 20020120618 Al 

NOVELTY - An expansion unit (9) refers to information about 
predicates used in query processing and the degree of the connections 
of the predicates that are stored in a predicate dictionary (4) for 
converting a query which is input into the integrated database system 
(1), into several query sets. 

DETAILED DESCRIPTION - An INDEPENDENT CLAIM is included for 
recorded medium storing query processing program. 

USE - For query processing, e.g. for DNA sequence analysis 
application. 

ADVANTAGE - By referring to the information about the predicates 
for converting the query input into the database system into several 
query sets, the cost of the query processing is minimized and accurate 
query results are obtained. 

DESCRIPTION OF DRAWING (S) - The figure shows the block diagram of 
the integrated database system. 

Integrated database system (1) 
Predicate dictionary (4) 

Expansion unit (9) 
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Abstract (Basic) : JP 2001273297 A 

NOVELTY - An operation log acquisition section (110) stores 
search log indicating search condition of various databases. A 
preference database extraction section (130) extracts search 
objective database, and a search device (300) displays the search 
objective database sequentially, based on the search conditions. 

DETAILED DESCRIPTION - An INDEPENDENT CLAIM is also included for 
recording medium. 

USE - For searching database in Internet applications. 

ADVANTAGE - Searches suitable database efficiently within a 
short time. 

DESCRIPTION OF DRAWING (S) - The figure shows the block diagram of 
the search system. (Drawing includes non-English language text). 
Operation log acquisition section (110) 
Preference database extraction section (130) 
Search device (300) 
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Abstract (Basic) : US 5920857 A 

NOVELTY - Transactions (T) are split into sub-transactions (Tn) 
at transaction originator server and are executed using a two phase 
commit protocol. Logs of committed sub-transactions (CL) and 
sub-transactions (WL) ready to commit are maintained and verified at 
each server for each incoming transaction (Tn). 

DETAILED DESCRIPTION - During transaction execution at the server, 
a logical time is incremented at each server machine. A transaction 
T(I,D,V) is accumulated at client in three sets with an insert set (I), 
delete set (D) and a verify set (V) comprising set of data items to be 
inserted, deleted and set of descriptions (P) which contains 
information that identifies data retrieval operations performed by the 
client with respect to server, the particular server subjected to the 
client data retrieval operations and a logical time stamp at the 
particular server. A transaction (T) is delivered from a client to the 
selected server which is being designated as the transaction originator 
server. An INDEPENDENT CLAIM is also included for query optimization 
method. 

USE - For multi server database system comprising multiple 
client and multiple server and for B-tree. 

ADVANTAGE - The computational load to the server is reduced and a 
fine granularity is implemented which improves the overall server 
performance. The use of synchronized physical clocks is eliminated by 
using logical clocks. 

DESCRIPTION OF DRAWING (S) - The figure shows the work of an 
optimistic concurrency control algorithm with logical time stamps. 

pp; 8 DwgNo 2/2 

Title Terms: CONCURRENT; CONTROL; METHOD; MULTI; SERVE; DATABASE; SYSTEM; 
TREE 

Derwent Class: T01 

International Patent Class (Main): G06F-017/30 
File Segment: EPI 



18/5/5 (Item 4 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2003. Thomson Derwent. All rts. reserv. 

012408291 **Image available** 
WPI Acc No: 1999-214399/199918 
XRPX Acc No: N99-157802 

SQL queries optimization method in relation database management 
system 

Patent Assignee: NCR CORP (NATC) 
Inventor: KRAUS T B; RAMESH B; WALTER T A 
Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

US 5884299 A 19990316 US 97795114 A 19970206 199918 B 

Priority Applications (No Type Date) : US 97795114 A 19970206 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
US 5884299 A 11 G06F-017/30 

Abstract (Basic) : US 5884299 A 

NOVELTY - The query is examined to determine if it includes one or 
more aggregation operations on rows of a table in a relational database. 
Several local aggregate result rows are created by aggregating rows 
of the table by the aggregation operation. The local aggregate result rows are 
redistributed to several global aggregation operations to create 
several global aggregate result rows. 

DETAILED DESCRIPTION - An INDEPENDENT CLAIM is also included for 
query optimizing apparatus. 

USE - For optimizing SQL queries in relation database 
management system using aggregate or grouping function. In MPP computer 
system. 

ADVANTAGE - The queries are split into sub-queries by a single 
processor in order to minimize the overhead associated with the 
processing of the entire query. The sub-queries are performed 
simultaneously on a single processor using a multitasking operating 
environment. 

DESCRIPTION OF DRAWING(S) - The figure represents flow chart for 
the execution of the global aggregation in SQL queries optimization 
method. 
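The local-then-global aggregation described in this record can be sketched in a few lines. This is an illustrative single-process simulation, not the patented method: the function names, the (key, value) row format, the SUM and COUNT aggregates, and hash-based redistribution are assumptions made for the example.

```python
from collections import defaultdict

def local_aggregate(rows):
    """Phase 1: each partition builds partial (sum, count) per group key."""
    partial = defaultdict(lambda: [0, 0])
    for key, value in rows:
        partial[key][0] += value
        partial[key][1] += 1
    return dict(partial)

def redistribute(partials, n_workers):
    """Send every partial row for a given key to the same global worker."""
    buckets = [defaultdict(list) for _ in range(n_workers)]
    for partial in partials:
        for key, (total, count) in partial.items():
            buckets[hash(key) % n_workers][key].append((total, count))
    return buckets

def global_aggregate(bucket):
    """Phase 2: merge the partial rows for each key into a final row."""
    return {key: (sum(t for t, _ in parts), sum(c for _, c in parts))
            for key, parts in bucket.items()}

# Two hypothetical partitions of a table with rows (group_key, value).
partitions = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("a", 5), ("b", 1)],
]
partials = [local_aggregate(p) for p in partitions]
result = {}
for bucket in redistribute(partials, n_workers=2):
    result.update(global_aggregate(bucket))
```

Only one partial row per group leaves each partition, so the amount of data redistributed depends on the number of distinct groups rather than on the number of base rows.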
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Abstract (Basic) : JP 8272806 A 

The system has a database storing part (11) in which the database 
is searched. A search type is given as the input to an input part (15). 
A division part (21) divides the search type to a single term type. 
Two search parts, an index search part (13) and a whole sentence search 
part (14) are also provided. 

An assignment part (32) assigns the single term type to both the 
search parts respectively. An arithmetic part (33) carries out logical 
operation of the results obtained from the index search and the whole 
sentence search parts, based on the search type and gives an output to 
a display part (16). 

ADVANTAGE - Searches database efficiently, using multiple 
search techniques. 
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Abstract (Basic) : WO 9618159 A 

The system has an extensible query architecture which allows an 
applications programmer to integrate new query models into the system 
as desired. The architecture is based on an abstract base class of 
query nodes, or code objects that retrieve records from the database. 
Specific sub-classes are derived from the base class. Each query node 
class includes a search function that iteratively searches the 
database for matching records. Query node objects are instantiated by 
associated node creator class objects. 

A parser is used to parse a search query into its components, 
including nested search queries used to combine various query models. 
The parser determines the particular search operator keywords and the 
node creator object. The node creator objects return pointers to the 
created query nodes. 

ADVANTAGE - Allows parser to assemble complex hierarchical query 
nodes that combine multiple query models. 
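
The class structure described here might look roughly like the sketch below. It is illustrative only: the node types, the `CREATORS` registry, and the nested-tuple form standing in for parser output are assumptions, and the node classes double as their own creators for brevity.

```python
from abc import ABC, abstractmethod

class QueryNode(ABC):
    """Abstract base class: every query model implements search()."""
    @abstractmethod
    def search(self, records):
        """Yield the records that match this node."""

class TermNode(QueryNode):
    def __init__(self, word):
        self.word = word
    def search(self, records):
        for r in records:
            if self.word in r.split():
                yield r

class AndNode(QueryNode):
    def __init__(self, *children):
        self.children = children
    def search(self, records):
        matches = list(records)
        for child in self.children:  # narrow the matches child by child
            matches = list(child.search(matches))
        yield from matches

class OrNode(QueryNode):
    def __init__(self, *children):
        self.children = children
    def search(self, records):
        records = list(records)
        hits = [set(c.search(records)) for c in self.children]
        for r in records:
            if any(r in h for h in hits):
                yield r

# Maps an operator keyword to the object that creates the matching node.
CREATORS = {"term": TermNode, "and": AndNode, "or": OrNode}

def build(parsed):
    """Turn a parsed query such as ("and", ("term", "x"), ...) into nested nodes."""
    keyword, *args = parsed
    creator = CREATORS[keyword]
    return creator(*(build(a) if isinstance(a, tuple) else a for a in args))
```

A new query model is added by deriving another `QueryNode` subclass and registering its creator under a new keyword.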
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RECORD TYPE: Review 

REVIEW TYPE: Product Analysis 

GRADE: Product Analysis, No Rating 

Information Builders' EDA/SQL, Merant's DataDirect, Wall Data's Cyberprise
Portal, Informix Software's uniVerseDB, and Sun Microsystems' Java are
middleware products that take disparate approaches to giving Web developers
easier data access. EDA is a seasoned product suite that can gain access to
88 separate data sources, optimize the query against the source, and
manage data from multiple sources into one report and back to the
requester. DataDirect links to seven databases, connects Java server
applications to data sources through Java Database Connectivity (JDBC)
drivers, and communicates with Microsoft-based sources, including Lotus
Notes. Cyberprise Portal, a Windows NT-based portal-building framework,
includes complete host and database access to many database products.
uniVerseDB is a developers' database with an interface for access to
RDBMSes and other databases. Many older data stores were constructed
before the need to support mobile users was considered, so new types of 
data access are required to link such remote users to their home offices. 
For instance, ThinWeb's Java-enabled product bases data access on Java's 
cross-platform portability and an automated structure for supplying access 
in Web applications. Vendors competing in the market include Attachmate, 
IBI, Merant, PLATINUM technology (now Computer Associates), Sybase, Cross 
Access, OpenLink Software, and thinWeb.com. 
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Confusing claims for the advantages of online CD-ROM database searching 
versus separately purchased CD-ROM databases (which are sometimes 
available in lower-cost paper versions) do nothing to guide users in 
choosing a product. Users must assess ease of use; power; storage and 
delivery; quality; functionality; speed; appeal; and total cost of using 
the media. In selecting CD-ROM databases, users have many choices for
some databases (for example, cinema directories), and none for others;
subsets of large databases are also available. New CD-ROM databases
likely to hit the market include popular magazines, new magazines, niche
offerings, and reference books. Many products will offer full-search
functions and images, and many will be cross-licensed (available from more
than one vendor). Although only KnowledgFinder software provides real
natural language functions, this valuable technology may be more widely 
implemented. 
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Some vendors are beginning to experiment with combining parallel query 
across instances, with parallel query within instances, in a 
shared-nothing architecture. Each node in this architecture would be a 
multiprocessor node, and queries would be broken into parts. Each part of 
a parallelized query is passed to a DBMS instance, where it is 
parallelized again to make the best use of each node's hardware. This
concept of two-level parallelism is appearing in many second-generation
database middleware products. The basis for parallel query optimization
is breaking SQL statements into multiple steps. This depends on how well
the data is partitioned across multiple disks, however.
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Abstract: In recent years, parallel processing and optimization 
algorithms for processing object-oriented databases have drawn a 
considerable amount of attention from the database research community. Two 
general types of algorithms have been introduced: hybrid-hash pointer-based 
algorithms and multiwavefront algorithms. In this work, we quantitatively
analyze the two algorithms and develop analytical formulas to capture the 
main performance features of these two approaches. We study their 
performance in three application environments: One is characterized by 
large databases having many object classes, each of which contains a 
large number of instances; the second one is characterized by large 
databases having many object classes, each of which contains a 
relatively small number of instances; and the third one is by large 
databases having object classes of varying sizes. A horizontal data 
partitioning strategy, in which each object class is partitioned into 
horizontal segments stored across all processors, is used in the first 
environment. A class-per-node assignment strategy, in which instances of 
each object class are stored in a single processor, is used in the second 
environment. In the third environment, object classes are partitioned 
horizontally and assigned to a varying number of processors depending on 
their different sizes. Our analytical results show that the multiwavefront
algorithm has three distinguishing features which contribute to its better
performance: 1) two-phase processing strategy, 2) vertical partitioning
of horizontal segments, and 3) dynamic determination of 'collision point'
in multiwavefront propagations which results in an optimized query
execution plan. We show that if these features are adopted by a
hybrid-hash, pointer-based algorithm, its performance will be comparable
with that of the multiwavefront algorithm because the difference in CPU
time between them is negligible. The assumed computing environment is a
network of workstations having a shared-nothing architecture. The schema and
some queries selected from the OO7 benchmark are used in the performance
analyses and comparisons. The queries are modified slightly in different
data environments in order to reflect the features of diverse database
applications. (Author abstract) 27 Refs.
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Abstract: Nested relations in partitioned normal form (PNF) are an 
important subclass of nested relations that are useful in many 
applications. In this paper we address the question of determining when 
every PNF relation stored under one nested relation scheme can be 
transformed into another PNF relation stored under a different nested 
relation scheme without loss of information, referred to as the two schemes 
being data equivalent. This issue is important in many database 
application areas such as view processing, schema integration, and schema 
evolution. The main result of the paper provides two characterizations of 
data equivalence for nested schemes. The first is that two schemes are data 
equivalent if and only if the two sets of multivalued dependencies induced 
by the two corresponding scheme trees are equivalent. The second is that 
the schemes are equivalent if and only if the corresponding scheme trees 
can be transformed into the other by a sequence of applications of a local 
restructuring operator and its inverse. (Author abstract) 29 Refs. 
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Abstract: Data broadcast is an effective approach to disseminate 
information from a database server to numerous mobile clients in a 
mobile environment. Since a broadcast session contains only a subset of the 
database items, a client might not be able to obtain all its items from the 
broadcast and is forced to request additional ones from the server on 
demand. In this paper, we describe a semantic-based broadcast approach 
which attaches a semantic description to each broadcast unit, called chunk, 
which is a cluster of data items. This allows a client to determine if a 
query can be answered entirely using a broadcast as well as defining the
precise nature of the remaining items in the form of a 'supplementary'
query. Chunks could be of different sizes and are hierarchically
organized. We propose a heuristic to schedule the broadcast order of the 
chunks to improve the tuning time, access time, and a new metric called 
data affinity index. The performances are evaluated via experiments based 
on a simulation model. (Author abstract) 16 Refs. 
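
To make the idea of a 'supplementary' query concrete, the sketch below assumes that each chunk's semantic description is a half-open key range on a single attribute and that a query is also such a range. That representation, the function name, and the flat (non-hierarchical) chunk set are assumptions for illustration; the paper's descriptions may be richer.

```python
def split_query(query, chunks):
    """Split a range query into broadcast-answerable and supplementary parts.

    query:  (lo, hi) half-open key range requested by the client.
    chunks: {chunk_id: (lo, hi)} semantic descriptions of the broadcast chunks.
    Returns (ids of useful chunks, ranges still to be requested on demand).
    """
    q_lo, q_hi = query
    useful, remaining, cursor = [], [], q_lo
    for cid, (lo, hi) in sorted(chunks.items(), key=lambda kv: kv[1]):
        if hi <= q_lo or lo >= q_hi:
            continue  # chunk does not overlap the query
        useful.append(cid)
        if lo > cursor:
            remaining.append((cursor, lo))  # gap not covered by any chunk
        cursor = max(cursor, hi)
    if cursor < q_hi:
        remaining.append((cursor, q_hi))
    return useful, remaining
```

An empty list of remaining ranges means the query can be answered entirely from the broadcast.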
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Abstract: Similarity indexing is the supporting technology for fast 
content-based retrieval of large media databases, and many similarity
index structures have been proposed. Compared with the many structures 
present, less attention has been paid to performance evaluation of index 
structures and theoretic analysis on factors influencing index performance. 
In this paper, we attempt to solve part of the problem and focus our 
research on analyzing the influence of data splitting methods. To give a 
formal definition for index structure performance evaluation, we introduce 
the query distribution probability concept and propose using average search 
cost to evaluate the performance of a similarity indexing structure. We 
choose the simplest case of similarity indexing, nearest-neighbor search,
in our discussion and deduce an expression for the average search cost
function. Based on analysis of the expression, we propose some criteria
that may be useful in index design and implementation. Then we extend these 
conclusions to the general similarity indexing case and use these criteria 
as general rules in index design and implementation. Basic thoughts and 
analysis are detailed, as well as experiment results. (Author abstract) 12
Refs.
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Abstract: Complex queries with multiple conjunctive (AND) predicates 
have attracted increasing attention for their importance in OLAP systems. 
Signature index is ideal for fast execution of queries involving multiple 
conjunctive predicates. However, previous signature-based methods did not 
separate the signatures from different attributes. We developed a signature 
augmented access index MG-tree. Its main features are: (1) it is a
light-weight search tree for indexing multiple attributes of text data, (2)
it eliminates the mix of signatures from different attributes, so the
construction and maintenance of the index is easy and search is
efficient. Our analyses and experiments show that the MG-tree achieved a
significant improvement in terms of access speed and storage space overhead 
over both tree-like indexes and previous multi-dimensional signature 
indexes, and is easier to build and maintain. MG-tree can be employed in 
practical DBMS implementations. (Author abstract) 12 Refs. 
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Abstract: Much work has been accomplished in the past on the subject of 
parallel query processing and optimization in parallel relational 
database systems; however, little work on the same subject has been done in 
parallel object-oriented database systems. Since the object-oriented view 
of a database and its processing are quite different from those of a 
relational system, it can be expected that techniques of parallel query 
processing and optimization for the latter can be different from the 
former. In this paper, we present a general framework for parallel 
object-oriented database systems and several implemented query 
processing and optimization strategies together with some performance 
evaluation results. In this work, multiwavefront algorithms are used in
query processing to allow a higher degree of parallelism than the
traditional tree-based query processing. Four optimization strategies,
which are designed specifically for the multiwavefront algorithms and for
the optimization of single as well as multiple queries, are introduced. The
query processing algorithms and optimization strategies have been
implemented on a parallel computer, nCUBE2; and the results of a
performance evaluation are presented in this paper. The main emphases and
the intended contributions of this paper are (1) data partitioning,
query processing and optimization strategies suitable for parallel
OODBMSs, (2) the implementation of the multiwavefront algorithms and
optimization strategies, and (3) the performance evaluation results. 
(Author abstract) 54 Refs. 
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Abstract: Signature files have been studied extensively as an access 
method for textual databases. Many approaches have been proposed for
searching signature files efficiently. However, different methods make
different assumptions and use different performance measures, making it 
difficult to compare their performance. In this paper, we study three basic 
methods proposed in the literature, namely, the indexed descriptor file, 
the two-level super-imposed coding scheme, and the partitioned signature 
file approach. The contribution of this paper is two-fold. First, we 
present a uniform analytical performance model so that the methods can be 
compared fairly and consistently. The analysis shows that the two-level 
superimposed coding scheme, if stored in a transposed file, has the best 
performance. Second, we extend the two-level superimposed coding method
into a multilevel superimposed coding method; we obtain the optimal number
of levels for the multilevel method and show that for databases with
reasonable size the optimal value is much larger than 2, which is assumed
in the two-level method. The accuracy of the analytical formula is
demonstrated by simulation. (Author abstract) 21 Refs.
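
For readers unfamiliar with signature files, the sketch below shows plain single-level superimposed coding, the building block behind the schemes compared in this abstract. It does not implement the two-level or multilevel methods; the signature width, the number of bits per word, and the SHA-256-based bit selection are arbitrary assumptions.

```python
import hashlib

def word_signature(word, width=64, bits_per_word=3):
    """Set a few hash-chosen bits for one word."""
    sig = 0
    for i in range(bits_per_word):
        digest = hashlib.sha256(f"{i}:{word}".encode()).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % width)
    return sig

def block_signature(words, **kw):
    """Superimpose (bitwise OR) the word signatures of a text block."""
    sig = 0
    for w in words:
        sig |= word_signature(w, **kw)
    return sig

def candidates(query_words, signatures, **kw):
    """Blocks whose signature covers the query signature (may include false drops)."""
    q = block_signature(query_words, **kw)
    return [i for i, s in enumerate(signatures) if s & q == q]

def search(query_words, blocks, signatures, **kw):
    """Filter by signature, then verify against the text to remove false drops."""
    hits = candidates(query_words, signatures, **kw)
    return [i for i in hits if set(query_words) <= set(blocks[i])]
```

The signature test can report blocks that do not contain the words but never misses one that does, which is why the verification step is needed.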
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Abstract: An important feature of database technology of the nineties is 
the use of parallelism for speeding up the execution of complex queries. 
This technology is being tested in several experimental database 
architectures and a few commercial systems for conventional 
select-project-join queries. In particular, hash-based fragmentation is
used to distribute data to disks under the control of different processors 
in order to perform selections and joins in parallel. With the development 
of new query languages, and in particular with the definition of transitive 
closure queries and of more general logic programming queries, the new 
dimension of recursion has been added to query processing. Recursive 
queries are complex; at the same time, their regular structure is 
particularly suited for parallel execution, and parallelism may give a high 
efficiency gain. We survey the approaches to parallel execution of 
recursive queries that have been presented in the recent literature. We
observe that research on parallel execution of recursive queries is 
separated into two distinct subareas, one focused on the transitive 
closure of Relational Algebra expressions, the other one focused on 
optimization of more general Datalog queries. Though the subareas seem 
radically different because of the approach and formalism used, they have 
many common features. This is not surprising, because most typical Datalog 
queries can be solved by means of the transitive closure of simple 
algebraic expressions. We first analyze the relationship between the 
transitive closure of expressions in Relational Algebra and Datalog 
programs. We then review sequential methods for evaluating transitive
closure, distinguishing iterative and direct methods. We address the
parallelization of these methods, by discussing various forms of
parallelization. Data fragmentation plays an important role in obtaining
parallel execution; we describe hash-based and semantic fragmentation. 
Finally, we consider Datalog queries, and present general methods for 
parallel rule execution; we recognize the similarities between these 
methods and the methods reviewed previously, when the former are applied to 
linear Datalog queries. We also provide a quantitative analysis that shows 
the impact of the initial data distribution on the performance of methods. 
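
One iterative method of the kind surveyed here is semi-naive evaluation, in which only the tuples discovered in the previous round are joined with the base relation. The sketch below combines it with hash-based fragmentation of the base relation; the loop over fragments stands in for separate processors. The function names and the choice to fragment on the source attribute are assumptions for illustration.

```python
def fragment_by_hash(edges, n):
    """Hash-based fragmentation: edge (a, b) goes to the fragment chosen by hash(a)."""
    fragments = [set() for _ in range(n)]
    for a, b in edges:
        fragments[hash(a) % n].add((a, b))
    return fragments

def join_step(delta, fragments):
    """Join the newly found pairs with each fragment of the base relation."""
    n = len(fragments)
    new = set()
    for i, fragment in enumerate(fragments):  # one iteration models one processor
        routed = {(a, b) for a, b in delta if hash(b) % n == i}
        new |= {(a, d) for a, b in routed for c, d in fragment if b == c}
    return new

def transitive_closure(edges, n_fragments=2):
    """Semi-naive iteration: stop when a round produces no new pairs."""
    fragments = fragment_by_hash(edges, n_fragments)
    closure = set(edges)
    delta = set(edges)
    while delta:
        delta = join_step(delta, fragments) - closure
        closure |= delta
    return closure
```

Routing a pair (a, b) by hash(b) sends it to the one fragment that can hold edges starting at b, so each simulated processor joins only local data.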
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Abstract: A multidatabase system (MDBS) is a system that integrates the 
operational data of several autonomous database systems and provides a
uniform interface and control mechanisms to control access to those data. 
To efficiently retrieve and manipulate the data stored in MDBS, a metadata 
dictionary is needed as a repository of essential information for 
reasoning, controlling, and maintaining the retrieval/manipulation 
processes. In this paper we developed a two-level active metadata 
dictionary approach based on logic for building a metadata dictionary, 
query processing, and maintenance in MDBS. The low-level metadata 
dictionaries (LLMDs) keep metadata for each corresponding local database in 
MDBS, respectively. The high-level metadata dictionary (HLMD) integrates
the metadata about all LLMDs. The evaluation strategy is a top-down
approach that starts by considering a query as a global goal to be
achieved. The query is unified with rules successively to decompose the goal
into subgoals which can be evaluated against the extensional database. These
subgoals are then translated into corresponding queries against the underlying
DBMSs, respectively. The database integration strategy includes two phases:
schema translation and schema integration. It is a bottom-up approach
integrating schema from the underlying database schemas. Update may cause
inconsistencies in MDBS. We use incremental integrity constraint checking 
to preserve consistency. The semantic query optimization evaluation can 
be partitioned into two phases: compilation phase and evaluation phase. 
During the compilation phase, residues are computed and associated with
deductive rules through a partial subsumption algorithm. In the evaluation
phase, redundant residues are eliminated and the query is then translated
into a query against the underlying DBMS. (Author abstract) Refs.
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DNA/protein sequence comparison, usually organized as a database 
search, is a very powerful tool in modern molecular biology. In recent 
years, the rapid growth of sequence databases in their size as well as in 
their number poses demands for efficient programs to search these 
databases. In this thesis a distributed system capable of performing 
sequence searches on multiple biological databases simultaneously has 
been designed, implemented and tested. 

The two-phase nature of the FASTA algorithm makes it the algorithm of
choice to be modified for our distributed system. The system is built on a
three-tier architecture to support a flexible, expandable, and most
importantly, user transparent server network. The system is capable of 
searching multiple homogeneous and heterogeneous databases in a single 
query. Also, it can handle concurrent multiple client connections. 

In summary, the work accomplished in this thesis has demonstrated that 
the performance of sequence queries on multiple biological databases 
can be significantly improved if a distributed algorithm is used, compared 
to running uncoordinated parallel searches on these individual databases. 
It also shows that the usability of existing biological databases and 
database search programs can be greatly enhanced if multiple databases 
can be queried simultaneously, as one logical database, because users 
obtain the search results in one compiled report, which is not available if 
they run the searches separately on individual databases. Moreover, 
this thesis demonstrates that the Client/Server computing model used in 
biological database queries can greatly expand the possibilities to build a 
centralized biological data warehouse to facilitate multiple remote client 
requests through the Internet. 
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Databases increasingly integrate different types of information such 
as multimedia data. As a result, it is becoming necessary to support 
efficient storage and retrieval of multi-dimensional data. In several 
modern database applications, both the dimensionality and the amount of 
data that needs to be processed are increasing rapidly. Therefore, it is 
important to develop techniques that overcome the scalability and the 
dimensionality problems of multi-dimensional data sets. Since the amount of 
data is large, it is crucial to develop techniques that exploit parallelism 
in large-scale databases. In this context, we propose partitioning and 
declustering techniques for multi-disk architectures. Several effective 
solutions for the high dimensionality problem are also proposed: access 
structures for efficient searching, and dimensionality reduction
techniques to remove the curse of dimensionality. In particular, we propose 
a compression based index structure, a clustering based approximate search 
technique, and a dimensionality reduction technique using inner product 
approximations. Finally, we discuss two new types of queries and propose 
efficient techniques to process them. Extensive experimental evaluation of 
all presented techniques has been performed and comparison with other
state-of-the-art approaches is presented.
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Similarity searches in sequence databases are important in many 
application domains such as information retrieval, data mining, and 
clustering. Although sequential scanning can be used to perform similarity 
searches, it may require enormous processing time over large sequence 
databases. Recently, several indexing techniques have been proposed to
speed up the processing of similarity searches. 

Most of the previous techniques use the Euclidean distance metric as a 
similarity measure. However, in many applications, the sampling rates and 
the lengths of sequences may be different, making it difficult or 
impossible to use the Euclidean distance metric. In the area of speech 
recognition, this problem has been approached using a similarity measure
called the time warping distance, which allows sequences to be stretched or
compressed along the time axis. 
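
For reference, the classic dynamic-programming form of the time warping distance is sketched below. Using the absolute difference as the per-element cost is an assumption; the dissertation may define the base distance differently.

```python
def time_warping_distance(s, t):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(s), len(t)
    # d[i][j] = distance between the first i elements of s and the first j of t
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - t[j - 1])
            # stretch s, stretch t, or advance both
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

Sequences of different lengths can have distance zero, for example `[1, 2, 3]` and `[1, 2, 2, 3]`, a case the Euclidean distance cannot handle because it requires equal lengths.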

In this dissertation, we investigate a set of indexing techniques for 
the fast retrieval of similar (sub)sequences of different lengths or
different sampling rates. The goal of our approach is to achieve high
search performance without missing any qualified answers.

We first propose a whole sequence searching method, which extracts a 
time-warping invariant feature vector from each sequence and uses a
lower-bound time warping distance function to compute the distance of any
two feature vectors. The proposed method efficiently performs similarity
search using a multi-dimensional index built on the set of feature
vectors.

We then propose a subsequence searching method, which uses a
disk-based suffix tree as an index structure and employs lower-bound time
warping distance functions to filter out dissimilar subsequences. To make 
the index structure compact and thus accelerate the query processing, the 
proposed method introduces the categorization and sparse indexing schemes. 

For a database with long data sequences, we propose a segment-based
subsequence searching scheme which changes the similarity measure from time 
warping to piece-wise time warping in order to reduce the number of 
possible subsequences to be compared. For a database with multi-dimensional 
data sequences such as image sequences and video streams, we extend the 
proposed techniques by introducing the multi-dimensional time warping 
distance function. Finally, we apply the proposed subsequence searching 
techniques to the problem of discovering and matching sequential 
association rules.
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Several database modeling and querying innovations are introduced 
in a stream-based multimedia database system. The data model 
advances/builds on prior work, focusing on the use of 
<italic>streams</italic> for storing temporally ordered sequences of 
information. New stream constructs are introduced and formalized, allowing 
manipulation and organization of streams: <italic>substreams</italic>,
enabling users to define logical partitions of information within a
stream using conditions specified on its contents; <italic>aggregated
streams</italic>, generalizing the operation of combining two or more
streams into one entity through a function stipulating aggregation
behavior; and <italic>derived streams</italic>, generated through the
application of a method to one or more streams. <italic>Stream 
relationships</italic>, expressing relations between streams and other 
database objects, are also developed. Formal notation and algorithms for 
supporting these constructs are developed. The stream constructs complement 
existing object-oriented database models, and increase the overall 
capability in multimedia data representation. 

Querying facilities have been developed to support the stream 
constructs. The beginnings of a new stream algebra are described, defining 
stream constructs and basic operations required for querying. The stream 
algebra also sets the foundation for exploring optimization of queries 
involving streams. An extended visual query language for the new constructs 
is also illustrated, with operators supporting set and temporal predicates 
on streams, and a grouping mechanism to reduce graphical complexity. The
visual query language additionally enables creation of views on streams by 
permitting customization of queries for different user models. 
Investigation of indexing to support querying of the constructs was 
performed; use of these methods is examined to optimize retrieval and other 
query processing operations. 

To validate the work, an integrated multimedia application, <italic> 
TimeLine</italic>, has been developed in the medical domain. Patient 
information in different clinical databases is re-organized in accordance
with a stream-based schema. TimeLine facilitates the longitudinal view of
patients' medical histories, presenting physicians with a problem-oriented
temporal visualization of data.

Preliminary user testing and evaluation were performed. Users' 
comprehension of the data modeling and visual querying concepts was tested. 
TimeLine was also evaluated, measuring its impact on physicians and the 
acceptance of the interface relative to the current clinical environment. 
Results from the evaluations were positive overall. 
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A data warehouse is a stand-alone repository of information consisting 
of “interesting” and “historic” data from several
heterogeneous, operational databases, and the size of a data warehouse is
very large and grows over time. Data warehouses are usually dedicated to
the processing of queries issued by decision support systems (DSS). The
response time of DSS queries is typically several orders of magnitude 
higher than the response time of OLTP (OnLine Transaction Processing) 
queries. Since DSS queries are often submitted interactively, techniques 
for reducing their response time are important. 

The caching of query results is one such technique particularly well 
suited to the DSS environment. In this thesis, we present an intelligent 
cache manager for such an environment. The cache manager can look up queries
either based on an exact query match or using a <italic>query split 
</italic> algorithm to efficiently find query results which subsume the 
submitted query. The cache manager dynamically maintains the cache content 
by deciding whether a new query result should be admitted to the cache and 
if so, which query results should be evicted from the cache. The decisions 
are aimed at minimizing query response time. The decisions are based on the 
execution cost of each query, the size of each query result, the reference 
frequency to each result, the cost of maintenance of each result due to 
updates of the base tables, and the frequency of updates. Experimental 
evaluation shows that the manager can significantly improve performance 
when compared to similar systems. 
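
The admission and eviction decision described above could be sketched as follows. The profit formula, which combines the five factors the abstract lists, is an assumed form chosen for illustration and not the thesis's actual metric; class and field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CachedResult:
    query: str
    size: float
    exec_cost: float          # cost of re-executing the query
    ref_freq: float           # how often the result is referenced
    maint_cost: float = 0.0   # cost of refreshing the result after an update
    update_freq: float = 0.0  # how often the base tables are updated

    @property
    def profit(self):
        # Assumed form: net cost saved per unit of cache space.
        saved = self.exec_cost * self.ref_freq - self.maint_cost * self.update_freq
        return saved / self.size

class QueryCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}

    def used(self):
        return sum(e.size for e in self.entries.values())

    def admit(self, candidate):
        """Admit the candidate only if every result it displaces is less profitable."""
        if candidate.size > self.capacity:
            return False
        victims, free = [], self.capacity - self.used()
        for entry in sorted(self.entries.values(), key=lambda e: e.profit):
            if free >= candidate.size:
                break
            if entry.profit >= candidate.profit:
                return False  # would evict something more valuable
            victims.append(entry)
            free += entry.size
        for v in victims:
            del self.entries[v.query]
        self.entries[candidate.query] = candidate
        return True
```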

Since Web documents vary in their size, and the cost of their 
materialization depends upon the network delays, a profit based cache 
replacement algorithm can be applied to Web caching. At the same time, the 
cache must guarantee some form of consistency of the cached documents. 
Cache consistency algorithms enforce appropriate guarantees about the
staleness of the cached documents. We have developed a unified cache
maintenance algorithm which integrates both cache replacement and
consistency algorithms. A trace-driven experimental study shows that the
unified algorithm not only improves the average response time but also
significantly reduces the number of stale documents returned to the clients.
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This thesis presents the design and implementation of a database query 
processing engine that is optimized for access to tertiary memory devices. 
Tertiary memory devices provide a cost-effective solution for handling the
on-going information explosion. While cheap and convenient, they pose new 
optimization challenges. Not only are tertiary devices three orders of 
magnitude slower than disks, but they also have a highly non-uniform access 
latency. Therefore, it is crucial to carefully reduce and reorder I/O on 
tertiary memory using effective query scheduling, batching, caching, 
prefetching and data placement techniques. 

We make two key modifications to an existing query processing 
architecture to support such aggressive optimizations: The first is a 
scheduler that uses system-wide information to make query scheduling, 
caching and device scheduling decisions in an integrated manner. The second 
is a reorderable executor that can process each query plan in the order in 
which data is made available by the scheduler rather than demand and 
process data in a fixed order, as in most conventional query execution 
engines. The two together provide unprecedented opportunities for
optimizing accesses to tertiary memory. We have extended the POSTGRES
database system with these optimizations. Measurements on the prototype
yielded almost an order of magnitude improvement on the SEQUOIA-2000
benchmark and on queries over synthetic datasets.

We explore data placement techniques on tertiary memory devices to 
enable better clustering. This thesis concentrates on data placement issues 
for large multidimensional arrays, one of the largest contributors of data
volume in many database systems. We discuss four techniques for doing 
this: (1) storing the array in multidimensional "chunks" to minimize the 
number of blocks fetched, (2) reordering the chunked array to minimize seek 
distance between accessed blocks, (3) maintaining redundant copies of the 
array, each organized for a different chunk size and ordering and (4) 
partitioning the array onto platters of a tertiary memory device so as to 
minimize the number of platter switches. Measurements on data obtained from 
global change scientists show that accesses on arrays organized using these 
techniques are often an order of magnitude faster than on the unoptimized
data.
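
The effect of technique (1), storing an array in multidimensional chunks, can be illustrated by counting how many chunks a rectangular query touches under different chunk shapes. The helper below is a hypothetical illustration, not code from the thesis; a row-major layout is modeled as chunks one row high.

```python
from itertools import product

def chunks_touched(query, chunk_shape):
    """Return the chunk coordinates overlapped by a rectangular query.

    query:       list of (start, stop) half-open index ranges, one per dimension.
    chunk_shape: chunk extent along each dimension.
    """
    ranges = [
        range(lo // c, (hi - 1) // c + 1)
        for (lo, hi), c in zip(query, chunk_shape)
    ]
    return set(product(*ranges))
```

For a 100 x 100 array with 100 cells per block, a 10 x 10 query touches 10 blocks when each block holds one full row (`chunk_shape=(1, 100)`) but only 1 block with square chunks (`chunk_shape=(10, 10)`); a full-column query touches 100 versus 10.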
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Distributed database technology is expected to have a significant 
impact on data processing in the upcoming years because distributed 
database systems have many potential advantages over centralized systems 
for geographically distributed organizations. Data allocation and query 
optimization are two of the most important aspects of distributed database 
design. Data allocation involves placing a database and the applications 
that run against it in the multiple sites of a network. It is a very 
complex problem consisting of two processes: data fragmentation and 
fragment allocation. Data fragmentation involves the partitioning of each 
relation into a group of fragment relations while fragment allocation deals 
with the distribution of these fragmented relations across the sites of the 
distributed system. Query optimization includes designing algorithms 
that analyze and convert queries into a set of data manipulation 
operations. Both the data allocation and query optimization problems 
are NP-hard in nature and notoriously difficult to solve. We have attempted 
to combine the two highly interrelated and interactive decision processes 
in data allocation by formulating them as integer programs taking into 
consideration different constraints and under various assumptions. Various 
solution methods are discussed and a new linearization method is 
investigated. We next analyze the query optimization problem and reduce 
it to a join ordering problem. Several heuristics and a genetic algorithm 
have been developed for solving the join ordering problem. Some 
computational experiments on these algorithms were conducted and solution 
qualities compared. The computational experiments show that the suggested 
linearization method performs clearly and consistently better than a 
currently widely used method and that heuristics and genetic algorithms are 
viable methods for solving the query optimization problem. 

It is anticipated that the models and solution methods developed in 
this study for data allocation and query optimization in distributed 
database systems may be of practical as well as theoretical use. 
Nevertheless, much more needs to be done to solve the distributed database 
design problems in order to achieve its potential benefits. Our models and 
solution methods can be the starting point for eventual resolution of these 
complex problems in large scale distributed database systems. 
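
To make the join ordering problem concrete, here is a minimal greedy heuristic. It is a generic textbook-style sketch, not one of the heuristics developed in the study. The cost model (sum of intermediate result sizes), the cardinalities and the selectivities are all assumptions for illustration.

```python
def greedy_join_order(card, sel):
    """Pick a left-deep join order greedily.

    card: {relation_name: row_count}
    sel:  {frozenset({a, b}): join selectivity}; missing pairs default to 1.0
          (i.e. a cross product).
    Returns (order, cost) where cost is the sum of intermediate result sizes.
    """
    remaining = set(card)
    start = min(remaining, key=lambda r: (card[r], r))  # smallest relation first
    order = [start]
    remaining.remove(start)
    size = float(card[start])
    cost = 0.0
    while remaining:
        def result_size(r):
            # Estimated size after joining r with everything joined so far.
            s = size * card[r]
            for j in order:
                s *= sel.get(frozenset((j, r)), 1.0)
            return s

        nxt = min(remaining, key=lambda r: (result_size(r), r))
        size = result_size(nxt)
        cost += size
        order.append(nxt)
        remaining.remove(nxt)
    return order, cost


if __name__ == "__main__":
    card = {"A": 1000, "B": 100, "C": 10}            # hypothetical cardinalities
    sel = {frozenset(("A", "B")): 0.01,               # hypothetical selectivities
           frozenset(("B", "C")): 0.1}
    print(greedy_join_order(card, sel))
```

A greedy choice like this is fast but can miss the optimal order, which is one motivation for also trying search methods such as genetic algorithms.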
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Much work has been accomplished in the past on the subject of parallel 
query processing and optimization in parallel relational database 
systems. However, little work on the same subject has been done in parallel 
object-oriented database systems. Since the object-oriented view of a 
database and its processing are quite different from those of a relational 
system, it can be expected that techniques of parallel query processing 
and optimization for the latter can be different from the former. In this 
dissertation, we present two parallel architectures, a general framework 
for parallel object-oriented database systems, several implemented 
query processing and optimization strategies together with some 
performance evaluation results. In this work, multi-wavefront algorithms 
are used in query processing to allow a higher degree of parallelism than 
the traditional tree-based query processing. Four optimization 
strategies, which are designed specifically for the multi-wavefront 
algorithms and for the optimization of single as well as multiple queries, 
are introduced and evaluated. A distributed result collection scheme which 
is designed to support retrieval queries is also introduced. Furthermore, 
two parallel architectures, namely, master-slave and peer-to-peer 
architectures are compared. A comparison is also made for two data 
placement strategies, namely, class-per-node vertical partitioning and 
hybrid partitioning. The query processing algorithms, four optimization 
strategies and the distributed result collection scheme have been 
implemented on a parallel computer nCUBE2, and the results of a performance 
evaluation are presented in this dissertation. The main emphases and the 
intended contributions of this dissertation are (1) data partitioning, 
parallel architecture, query processing, query optimization and 
result collection strategies suitable for parallel OODBMSs; (2) the 
implementation of these strategies; and (3) the performance evaluation 
results. 
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One of the primary functions of computers is to store information, 
i.e., to deal with long lived or persistent data. Programmers working with 
persistent data structures are faced with the problem that there are two, 
mostly incompatible, views of structured data, namely data in primary and 
secondary storage. Traditionally, these two views of data have been dealt 
with independently by researchers in the programming language and database 
communities. 

Significant research has occurred over the last decade on efficient 
and easy-to-use methods for manipulating persistent data structures in a 
fashion that makes the secondary storage transparent to the programmer. 
Merging primary and secondary storage in this manner produces a 
single-level store, which gives the illusion that data on secondary storage 
is accessible in the same way as data in primary storage. In complex design 
environments, a single-level store offers substantial performance 
advantages over conventional file or database access. These advantages are 
crucial to unconventional database applications such as computer-aided 
design, text management, and geographical information systems. In addition, 
a single-level store reduces complexity in a program by freeing the 
programmer from the responsibility of dealing with two views of data. 

This dissertation proposes, develops and investigates a novel approach 
for implementing single-level stores using memory mapping. Memory mapping 
is the use of virtual memory to map data stored on secondary storage into 
primary storage so that the data is directly accessible by the processor's 
instructions. In this environment, all transfer of data to and from the 
secondary store takes place implicitly during program execution. The 
methodology was motivated by the significant simplification in expressing 
complex data structures offered by the technique of memory mapping. This 
work parallels other proposals that exploit the potential of memory 
mapping, but develops a unique approach based on the ideas of segmentation 
and exact positioning of data in memory. Rigorous experimentation has been 
conducted to demonstrate the effectiveness and ease of use of the proposed 
methodology vis-a-vis the traditional approaches of manipulating structured 
data on secondary storage. 
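
The basic idea of memory mapping can be sketched with Python's standard `mmap` and `struct` modules. This is only an illustration of the general technique, not the dissertation's segmentation and exact-positioning scheme. The record layout, the helper names and the file name are assumptions.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record: 64-bit integer key, 64-bit float value.
RECORD = struct.Struct("<qd")


def create_store(path, n_records):
    """Create a zero-filled file large enough for n_records records."""
    with open(path, "wb") as f:
        f.write(b"\x00" * (RECORD.size * n_records))


def put(mm, slot, key, value):
    """Write a record in place; the OS pages it out to the file."""
    RECORD.pack_into(mm, slot * RECORD.size, key, value)


def get(mm, slot):
    """Read a record directly from the mapped region."""
    return RECORD.unpack_from(mm, slot * RECORD.size)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "store.bin")
    create_store(path, 4)
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as mm:
        put(mm, 2, 42, 3.5)
        mm.flush()
    # Re-open the file: the data persisted without explicit read/write calls.
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as mm:
        print(get(mm, 2))
```

The program never issues explicit file reads or writes for individual records. Transfers to and from secondary storage happen implicitly through the virtual memory system, which is the property the text describes.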

The behaviour of high-level database algorithms in the proposed memory 
mapped highly parallel environment, especially in systems, has been 
investigated. A quantitative analytical model of computation in this 
environment has been designed and validated through experiments conducted 
on several database join algorithms; parallel multi-disk versions of 
the traditional join algorithms were developed for this purpose. An 
analytical model of the system is extremely useful for data structure and 
algorithm designers for predicting general performance behaviour without 
having to construct and test specific algorithms. More importantly, a 
quantitative model is an essential tool for database subsystems such as a 
query optimizer. 
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In the relational data model a relation is a set of tuples; therefore 
the same tuple cannot exist more than once in a relation. However, in 
practice the need arises for relations with duplicates, called 
multirelations. Many database systems, in coping with duplicates, are 
inconsistent and ill-defined. The first part of this thesis provides a 
theoretic and practical framework for integrating multirelations into 
relational databases. 

We argue that a multirelation contains semantically incomplete 
data, being a vertical section of the complete relation, a relation 
without duplicates. The multirelation constitutes the output columns of the 
complete relation and the rest are called hidden columns. The multirelation 
partially describes the complete relation entities and is meaningful only 
in the context of the complete relation. 

Accordingly, base relations or views should not be extended to 
include multirelations. However, a multirelation serves naturally as query 
output, where often partial information is desired. We define the notion of 
full multirelational expressiveness as any meaningful query with 
multirelational output (a multirelational query). Such a query specifies a 
complete relation and designates its hidden and output columns. We show how 
any relational query language can be extended to achieve full 
multirelational query expressiveness, and we present a description of its 
extension to the query language QUEL. 

We also show how to use tableau techniques to check equivalence 
among conjunctive multirelational queries and how to minimize such queries. 
In the presence of functional dependencies further query simplification 
is possible using the chase process. The new conversion chase rule is 
introduced which removes hidden columns from the complete relation of the 
query and thus simplifies it. 

The second part of this thesis investigates database 
fd-acyclicity. Acyclic schemes allow evaluation of join-project queries 
using semijoin instead of join operations. In the presence of functional 
dependencies some cyclic schemes acquire this property, and we address 
recognizing these schemes. 
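
For readers unfamiliar with the operation, a semijoin keeps the tuples of one relation that have a match in another, without attaching the other relation's columns. The sketch below is a generic illustration, not taken from the thesis. Representing relations as lists of dictionaries is an assumption.

```python
def semijoin(r, s, attrs):
    """Return the tuples of r that join with at least one tuple of s on attrs."""
    keys = {tuple(t[a] for a in attrs) for t in s}
    return [t for t in r if tuple(t[a] for a in attrs) in keys]


if __name__ == "__main__":
    r = [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]   # hypothetical relations
    s = [{"a": 2, "c": "z"}]
    print(semijoin(r, s, ["a"]))
```

Because the result is never larger than `r`, semijoins can shrink relations before the final result is assembled.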

We present and prove an fd-acyclicity decision algorithm for an 
important class of cyclic schemes called Acliques and an arbitrary set of 
functional dependencies. We also suggest a decision algorithm for general 
database schemes, based on the construction of the cycle space database 
instance. (Abstract shortened with permission of author.) 
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A processor functioning as a coprocessor attached to a central processing 
complex provides efficient execution of the functions required for database 
processing: sorting, merging, joining, searching and manipulating fields in 
a host memory system. The specialized functional units: a memory interface 
and field extractor/assembler, a Predicate Evaluator, a combined 
sort/merge/join unit, a hasher, and a microcode control processor, are all 
centered around a partitioned Working Store. Each functional unit is 
pipelined and optimized according to the function it performs, and executes 
its portion of the query efficiently. All functional units execute 
simultaneously under the control processor to achieve the desired results. 
Many different database functions can be performed by chaining simple 
operations together. The processor can effectively replace the CPU bound 
portions of complex database operations with functions that run at the 
maximum memory access rate improving performance on complex queries. 
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A processor functioning as a coprocessor attached to a central processing 
complex provides efficient execution of the functions required for database 
processing: sorting, merging, joining, searching and manipulating fields in 
a host memory system. The specialized functional units: a memory interface 
and field extractor/assembler, a Predicate Evaluator, a combined 
sort/merge/join unit, a hasher, and a microcoded control processor, are all 
centered around a partitioned Working Store. Each functional unit is 
pipelined and optimized according to the function it performs, and executes 
its portion of the query efficiently. All functional units execute 
simultaneously under the control processor to achieve the desired results. 
Many different database functions can be performed by chaining simple 
operations together. The processor can effectively replace the CPU bound 
portions of complex database operations with functions that run at the 
maximum memory access rate improving performance on complex queries. 
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Abstract: Nowadays feature vector based similarity search is increasingly 
emerging in database systems. Consequently, many multidimensional data 
index techniques have been widely introduced to the database researcher 
community. These index techniques are categorized into two main classes: SP 
(space partitioning)/KD-tree-based and DP (data 
partitioning)/R-tree-based. Recently, a hybrid index structure has been proposed. It 
combines both SP/KD-tree-based and DP/R-tree-based techniques to form a 
new, more efficient index structure. However, weaknesses still exist 
in the techniques above. In this paper, we introduce a novel and flexible index 
structure for multidimensional data, the SH-tree (Super Hybrid tree). 
Theoretical analyses show that the SH-tree is a good combination of both 
techniques with respect to both presentation and search algorithms. It 
overcomes the shortcomings and makes use of their positive aspects to 
facilitate efficient similarity searches. (36 Refs) 
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Abstract: SchemaSQL is a recently proposed extension to SQL for enabling 
multi-database interoperability. Several recently identified 
applications for SchemaSQL, however, mainly rely on its ability to treat 
data and schema labels in a uniform manner, and call for an efficient 
implementation of it on a single RDBMS. We first develop a logical algebra 
for SchemaSQL by combining classical relational algebra with four 
restructuring operators (unfold, fold, split, and unite) originally 
introduced in the context of the tabular data model by Gyssens et al. 
(1996), and suitably adapted to fit the needs of SchemaSQL. We give an 
algorithm for translating SchemaSQL queries/views involving restructuring, 
into the logical algebra above. We also provide physical algebraic 
operators which are useful for query optimization. Using the various 
operators as a vehicle, we give several alternate implementation strategies 
for SchemaSQL queries/views. All the proposed strategies can be 
implemented non-intrusively on top of existing relational DBMS, in that 
they do not require any additions to the existing set of plan operators. We 
conducted a series of performance experiments based on TPC-D benchmark 
data, using the IBM DB2 DBMS running on Windows NT. In addition to showing 
the relative tradeoffs between various alternate strategies, our 
experiments show the feasibility of implementing SchemaSQL on top of 
traditional RDBMS in a non-intrusive manner. Furthermore, they also suggest 
new plan operators which might profitably be added to the existing set 
available to relational query optimizers, to further boost their 
performance. (32 Refs) 
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Abstract: One of the common queries in many database applications is 
finding approximate matches to a given query item from a collection of data 
items. For example, given an image database, one may want to retrieve all 
images that are similar to a given query image. Distance-based index 
structures are proposed for applications where the distance computations 
between objects of the data domain are expensive (such as high-dimensional 
data) and the distance function is metric. In this paper we consider using 
distance-based index structures for similarity queries on large metric 
spaces. We elaborate on the approach that uses reference points (vantage 
points) to partition the data space into spherical shell-like regions in 
a hierarchical manner. We introduce the multivantage point tree structure 
(mvp-tree) that uses more than one vantage point to partition the space 
into spherical cuts at each level. In answering similarity-based queries, 
the mvp-tree also utilizes the precomputed (at construction time) distances 
between the data points and the vantage points. We summarize the 
experiments comparing mvp-trees to vp-trees that have a similar 
partitioning strategy, but use only one vantage point at each level and 
do not make use of the precomputed distances. Empirical studies show that 
the mvp-tree outperforms the vp-tree by 20% to 80% for varying query ranges 
and different distance distributions. Next, we generalize the idea of using 
multiple vantage points and discuss the results of experiments we have made 
to see how varying the number of vantage points in a node affects search 
performance and how much is gained in performance by making use of 
precomputed distances. The results show that, after all, it may be best to 
use a large number of vantage points in an internal node in order to end up 
with a single directory node and keep as many of the precomputed distances 
as possible to provide more efficient filtering during search 
operations. Finally, we provide some experimental results that compare 
mvp-trees with M-trees, a dynamic distance-based index structure 
for metric domains. (24 Refs) 
Subfile: C 
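
The vantage-point partitioning idea in the abstract above can be sketched as a plain vp-tree with one vantage point per node. This is a simplified illustration, not the paper's mvp-tree, which uses more than one vantage point per node and keeps precomputed distances. The node layout, the choice of the first point as vantage point and the function names are assumptions.

```python
import math


def build_vp_tree(points, dist):
    """Build a vp-tree: each node splits the remaining points by their
    distance to a vantage point (inside: < mu, outside: >= mu)."""
    if not points:
        return None
    vp, rest = points[0], points[1:]  # assumption: first point is the vantage point
    if not rest:
        return {"vp": vp, "mu": 0.0, "inside": None, "outside": None}
    dists = sorted(dist(vp, p) for p in rest)
    mu = dists[len(dists) // 2]  # median distance defines the spherical cut
    inside = [p for p in rest if dist(vp, p) < mu]
    outside = [p for p in rest if dist(vp, p) >= mu]
    return {"vp": vp, "mu": mu,
            "inside": build_vp_tree(inside, dist),
            "outside": build_vp_tree(outside, dist)}


def range_search(node, q, r, dist, out=None):
    """Collect all indexed points within distance r of q, pruning subtrees
    with the triangle inequality (requires dist to be a metric)."""
    if out is None:
        out = []
    if node is None:
        return out
    d = dist(q, node["vp"])
    if d <= r:
        out.append(node["vp"])
    if d - r < node["mu"]:    # the query ball can reach inside the cut
        range_search(node["inside"], q, r, dist, out)
    if d + r >= node["mu"]:   # the query ball can reach outside the cut
        range_search(node["outside"], q, r, dist, out)
    return out


if __name__ == "__main__":
    pts = [(x, y) for x in range(6) for y in range(6)]  # hypothetical data
    tree = build_vp_tree(pts, math.dist)
    print(sorted(range_search(tree, (2.2, 3.1), 1.5, math.dist)))
```

Each pruned subtree saves distance computations, which matters most when the distance function is expensive, as the abstract notes.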
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Abstract: Complex queries with multiple conjunctive (AND) predicates 
have attracted increasing attention for their importance in OLAP 
systems. The signature index is ideal for the fast execution of queries 
involving multiple conjunctive predicates. However, previous 
signature-based methods did not separate the signatures from different 
attributes. We have developed a signature-augmented access index, called an 
MG-tree. Its main features are: (1) it is a lightweight search tree for 
indexing multiple attributes of text data, and (2) it eliminates the mix of 
signatures from different attributes, so the construction and maintenance 
of the index is easy and searching is efficient. Our analyses and 
experiments show that the MG-tree achieved a significant improvement in 
terms of access speed and storage space overhead over both tree-like 
indexes and previous multi-dimensional signature indexes, and is easier to 
build and maintain. The MG-tree can be employed in practical DBMS 
implementations. (12 Refs) 
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Abstract: The issue of interoperability among multiple autonomous 
databases has attracted a lot of attention from researchers. The past 
research on this issue can be roughly divided into two main categories: 
the tightly-integrated approach that integrates databases by building an 
integrated schema; and the loosely-integrated approach that achieves 
interoperability by using a multidatabase language. Past efforts focus on 
the issues in the first approach. The problem with the first approach is 
that it lacks a convenient representation of the integrated schema at the 
system level and a sound mathematical basis for data manipulation in a 
multidatabase system. We propose to use hyperrelations as a powerful and 
succinct model for the global level representation of heterogeneous 
database schemas. A hyperrelation has the structure of a relation, but its 
contents are the schemas of the semantically equivalent local relations in 
the databases. With this representation, the metadata of the global 
database, local databases and the data of these databases are all 
representable by using the structure of a relation. The impact of such a 
representation is that all the elegant features of relational systems can 
be easily extended to multidatabase systems. A hyperrelational algebra is 
designed accordingly. This algebra is performed at the multidatabase 
systems (MDBS) level such that query transformation and optimization is 
supported on a sound mathematical basis. (52 Refs) 
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Abstract: To an increasing extent, applications demand the capability of 
retrieval based on image content. As a result, large image database systems 
need to be built to support effective and efficient accesses to image data 
on the basis of content. In this process, significant features must first 
be extracted from image data in their pixel format. These features must 
then be classified and indexed to assist efficient retrieval of image 
content. However, the issues central to automatic extraction and indexing 
of image content largely remain an open problem. Tools are not currently 
available with which to accurately specify image content for image database 
uses. In this paper, we investigate effective block-oriented image 
decomposition structures to be used as the representation of images in 
image database systems. Three types of block-oriented image decomposition 
structures, namely, quad-, quin- and nona-trees, are compared. In analyzing 
and comparing these structures, wavelet transforms are used to extract 
image content features. Our experimental analysis illustrates that 
nona-tree decomposition is the most effective of the three decomposition 
structures available to facilitate effective content-based image retrieval. 
Using the nona-tree structure to represent image content in an image 
database, various types of content-based queries and efficient 
image retrieval can be supported through novel indexing and searching 
approaches. We demonstrate that the nona-tree structure provides a highly 
effective approach to supporting automatic organization of images in large 
image database systems. (28 Refs) 
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Abstract: An approach to query processing in object oriented distributed 
database systems is proposed. Distributed PIOS is a server that supports an 
object-oriented data model, physical data independence (i.e. different 
strategies for storing class hierarchies, grouping, horizontal and vertical 
partitioning of objects), and fragmentation transparency (i.e. 
transactions are not aware of the distribution of database fragments on 
several nodes of a computer network). The problem of the optimization 
of distributed queries (i.e. determining which data must be accessed at 
which site and which data must be transmitted among sites) is the focus of 
the paper. (5 Refs) 
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Abstract: Query processing is a very important issue in distributed 
databases. Many algorithms have been proposed to process distributed 
queries efficiently. However, most of the algorithms use oversimplified 
cost models and ignore the impact of work load generated by other 
applications. As a result, load balancing is difficult to achieve in a real 
environment. We provide an adaptive scheme to do load balancing 
effectively. The scheme takes into account an environment in which the load 
at different sites varies. The partition and replicate strategy algorithm 
is used to explain how to achieve load balancing in a multi-user 
environment. The scheme also has learning capability such that the 
parameters of cost estimation functions can be adaptively adjusted as the 
environment changes. (20 Refs) 
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Abstract: Signature files have been studied extensively, as an access 
method for textual databases. Many approaches have been proposed for 
searching signature files efficiently. However, different methods make 
different assumptions and use different performance measures, making it 
difficult to compare their performance. In this paper, we study three basic 
methods proposed in the literature, namely, the indexed descriptor file, 
the two-level superimposed coding scheme, and the partitioned signature 
file approach. The contribution of this paper is two-fold. First, we 
present a uniform analytical performance model so that the methods can be 
compared fairly and consistently. The analysis shows that the two-level 
superimposed coding scheme, if stored in a transposed file, has the best 
performance. Second, we extend the two-level superimposed coding method 
into a multilevel superimposed coding method. We obtain the optimal number 
of levels for the multilevel method and show that for databases with 
reasonable size the optimal value is much larger than 2, which is assumed 
in the two-level method. The accuracy of the analytical formula is 
demonstrated by simulation. (21 Refs) 
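
As background for the methods compared above, the core of superimposed coding can be sketched as follows. This is a generic single-level illustration, not any of the three file organizations analysed in the paper. The signature width `F`, the number of bits per word `k` and the use of SHA-256 as the hash are assumptions.

```python
import hashlib


def word_signature(word, F=64, k=3):
    """Set up to k bit positions (out of F) for a word, chosen by hashing."""
    sig = 0
    for i in range(k):
        h = hashlib.sha256(f"{i}:{word}".encode()).digest()
        sig |= 1 << (int.from_bytes(h[:4], "big") % F)
    return sig


def block_signature(words, F=64, k=3):
    """Superimpose (bitwise OR) the signatures of all words in a text block."""
    sig = 0
    for w in words:
        sig |= word_signature(w, F, k)
    return sig


def may_contain(block_sig, word, F=64, k=3):
    """True if every bit of the word's signature is set in the block signature.
    False means the word is definitely absent; True may be a false drop."""
    q = word_signature(word, F, k)
    return (block_sig & q) == q


if __name__ == "__main__":
    block = ["signature", "file", "access", "method"]  # hypothetical text block
    sig = block_signature(block)
    print(may_contain(sig, "file"), may_contain(sig, "zebra"))
```

A search scans the compact signatures first and fetches only the blocks that pass the test. The candidates must then be checked against the actual text to eliminate false drops.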

Subfile: C 

Descriptors: information retrieval 

Identifiers: signature file methods; text retrieval; access method; 
textual databases; performance measures; indexed descriptor file; two-level 
superimposed coding scheme; partitioned signature file approach; 
simulation 

Class Codes: C7250 (Information storage and retrieval) 
Copyright 1995, IEE 



21/5/31 (Item 10 from file: 2) 

DIALOG (R) File 2 -INSPEC 

(c) 2003 Institution of Electrical Engineers. All rts. reserv. 

4846580 INSPEC Abstract Number: C9502-6160K-004 

Title: Specifying rule-based query optimizers in a reflective framework 

Author(s): Fegaras, L.; Maier, D.; Sheard, T. 

Author Affiliation: Dept. of Comput. Sci. & Eng., Oregon Graduate Center, 
Beaverton, OR, USA 
p. 146-68 

Editor(s): Ceri, S.; Tanka, K.; Tsur, S. 
Publisher: Springer-Verlag, Berlin, Germany 

Publication Date: 1993 Country of Publication: West Germany xl+488 
pp. 

ISBN: 3 540 57530 8 

Conference Title: Third International Conference, DOOD '93. Deductive and 
Object-Oriented Databases 

Conference Date: 6-8 Dec. 1993 Conference Location: Phoenix, AZ, USA 
Language: English Document Type: Conference Paper (PA) 
Treatment: Practical (P) 

Abstract: Numerous structures for database query optimizers have 
been proposed. Many of those proposals aimed at automating the construction 
of query optimizers from some kind of specification of optimizer 
behavior. These specification frameworks do a good job of partitioning 
and modularizing the kinds of information needed to generate a query 
optimizer. Most of them represent at least part of this information in a 
rule-like form. Nevertheless, large portions of these specifications still 
take a procedural form. The contributions of this work are threefold. We 
present a language for specifying optimizers that captures a larger portion 
of the necessary information in a declarative manner. This language is in 
turn based on a model of query rewriting where query expressions carry 
annotations that are propagated during query transformation and planning. 
This framework is reminiscent of inherited and synthesized attributes for 
attribute grammars, and we believe it is expressive of a wide range of 
information: logical and physical properties, both desired and delivered, 
cost estimates, optimization contexts, and control strategies. Finally, we 
present a mechanism for processing optimizer specifications that is based 
on compile-time reflection. This mechanism proves to be succinct and 
flexible, allowing modifications of the specification syntax, incorporation 
of new capabilities into generated optimizers, and retargeting the 
translation to a variety of optimization frameworks. We report on an 
implementation of our ideas using the CRML reflective functional language 
and on optimizer specifications we have written for several query algebras. 
(13 Refs) 
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Abstract: The paper analyses and simulates the impact of providing voice 
and non-voice personal communications services (PCS) on the volume of 
network database transactions. A number of activities in PCS, such as user 
mobility and call origination and delivery, require data management 
functions. These include mobility registration, radio channel management, 
service profile query, security-related functions such as 
authentication and privacy, and special billing arrangements. A large 
portion of these data management activities may be performed using elements 
of the intelligent network such as switches (e.g., service switching point 
or SSP) and network databases (e.g., service control point or SCP). The 
authors extend the single logical database model to include data 
partitioning into multiple databases. This model is helpful in case a 
single database is inadequate to handle all the transaction volume 
generated due to PCS. They give a first-cut analysis using such a model. A 
simulation framework is described for data management under various 
scenarios. This work pertains mainly to data management. In addition, 
qualitative effects of data partitioning on signaling traffic are 
outlined. (6 Refs) 
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ABSTRACT: Vertical fragmentation and access path selection are 
interdependent techniques in physical database design used to enhance 
performance in database systems. While vertical fragmentation in 
relational databases deals with assignment of attributes to physical files, 
access path selection deals with searching efficiently the physical 
location of data records. Vertical fragmentation is a combinatorial 
optimization problem that is NP-hard in most cases. We propose a genetic 
algorithm approach for the vertical fragmentation problem while addressing 
access path selection. The effectiveness and efficiency of the genetic 
algorithm are illustrated through several database design problems, 
ranging from 10 attributes/5 transactions to 30 attributes/18 transactions. 
In most cases, our design solutions match the global optimum solutions 
obtained from an exhaustive enumeration. Compared to unpartitioned 
databases, our design solution results in substantial savings (up to 80%) 
in the number of disk accesses. Reprinted by permission of the publisher. 
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ABSTRACT: The need for automatic text retrieval (TR), also known as 
document retrieval (DR), has caught the attention of researchers in natural 
language processing (NLP). An examination is made of DR's key properties. 
Past experience in the field is summarized, and various specific NLP 
research strategies targeting this form of information processing are 
reviewed. Although conventional DR services continue to make heavy use of 
strongly controlled indexing languages, indexing increasingly involves 
terms drawn from the natural language of documents. These simple 
natural-language indexing techniques have been shown adequate in many 
experiments, though not on a really large scale. These techniques are also 
beginning to be used for TR. However, the greater information detail in 
full text apparently calls for more sophisticated NLP-based approaches to 
indexing and retrieval. It is suggested that appropriate strategies for 
this new situation should follow the simple DR methods, extending them to 
handle compound terms and similar descriptive units. 
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...TEXT: various services. The long-running debate on controlled vs. 
natural language indexing has become less important as many commercial 
databases now use both. Most searches in these databases are done for end 
users by professional intermediaries who. . . context. In any case, 
grammatical and statistical methods are increasingly combined. 

The proposal described in the following sections develops these themes 
and investigates the role NLP may now play in full-text searching. The 
proposal... and does, for instance, exploit a store of paradigmatic 
knowledge. It may be difficult to convey the significance of statistical 
data; and while artificial description forms, like predicate-argument 
structures, can be applied in TR in a way that is hidden, so users are not 
. . . vocabulary problem in human-system communication. Commun. ACM, 30, 11 
(1987), 964-971. 

9. Hahn, U. Topic parsing: Accounting for text macro structures in 
full-text analysis. Inf. Process. Manage., 26, 1 (1990), 135-170... 
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ABSTRACT: Social norms have a pervasive effect on people's lives, affecting 
their dress, travel, recreation, work, and participation in the society at 
large. Other social norms make constraints on the behavior of commercial 
companies and other institutions, affecting the way they do business, their 
treatment of employees, and their treatment of the environment. As 
societies grow and become more diverse, the system of norms becomes 
correspondingly complex. An attempt is made to bring computational 
assistance to bear in managing and interacting with these normative 
systems. A prototype expert system is described that utilizes deontic rules 
for reasoning about normative constraints in organizations and other social 
systems. Applications to bureaucracies and electronic contracting systems 
are discussed. DX can be used to model regulations and policies in 
organizations, such as library regulations of universities, and resource 
access policies. Also, regulatory aspects in interorganizational systems 
can be effectively modeled and managed by DX. 

DESCRIPTORS: Expert systems; Systems development; Models; Social life & 
customs; Cognition & reasoning; Ethics; Applications 
CLASSIFICATION CODES: 5240 (CN=Software & systems); 2410 (CN=Social 
responsibilities); 1200 (CN=Social policy); 9130 
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...TEXT: will present a prototype deontic expert system, called DX, and 
demonstrate its operation and applicability. In later sections we discuss 
extensions to this basic model and its potential application to large-scale 
normative systems. 

2... choices of state transformations are done via the backtracking 
techniques [3, 24]. 

The computing process in this section is obtained by program clause logic 
programming of selected deontic logic axioms and theorems. It can be... is 
either a constant, variable, or a function. These terms are regarded as 
part of the open vocabulary as predicates for conditions are. The 
combined syntax for the DX rules is shown in figure 1. 

Note that. . . 

. . . components, interpreter and dialog, and provides data structures for 
rules and facts as defined in the previous section. The interpreter 
consists of a knowledge control strategy, which is based on the resolution 
of deontic logic... to determine the deontic status of a specific action, 
following the computing process for deontic reasoning in section 2.2. 
When an Assess command of an action, say X : DO-SOMETHING, is given, DX tries 
. . .2 theta sub 2 )>. 

Further transformations of the state and refutations of PERMIT { — ) by 
deontic reasoning in section 2.2 may result in a state: 

sigma = < (), (alpha sub 1 Theta sub 1 , alpha sub 2 ... accessible DX 
commands, through which the users query and update conditions. 

* DX provides deontic deduction capabilities. 

In sections 2 and 3, we focused our attention on the syntactic aspect of 
DX, which includes the procedural... 

. . . completeness of DX reasoning require further investigations; we will 
leave these as a future research issue. 

In section 5, we proposed two directions for extensions to DX: defeasible 
reasoning and temporal reasoning extensions. Another research... 
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The ONLINE 100: ONLINE Magazine's Field Guide to the 100 Most Important 
Online Databases 
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DOC TYPE: Journal article LANGUAGE: English LENGTH: 2 Pages 
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ABSTRACT: The book The ONLINE 100: ONLINE Magazine's Field Guide to the 100 
Most Important Online Databases by Mick O'Leary is reviewed. 

GEOGRAPHIC NAMES: US 

DESCRIPTORS: Online data bases; Ratings & rankings; Book reviews 
CLASSIFICATION CODES: 9190 (CN=United States); 8302 (CN=Software and 
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...TEXT: much less painful with his collection of the best 100 databases. 
The book is a directory of various types of databases available in the 
online world. Each database profile contains a brief description of the 
database, a "Content Notes" section, which summarizes the content of the 
database, a "Search Notes" section, which gives tips on effective 
searching, a section called "Do Not Use For," which notes the 
limitations of the database, and the "Key Facts" section, which lists the 
time span of the database, the producer, which systems carry it, where to 
find. . . 
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DataTimes' big move 
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ABSTRACT: The first generation online services, including pioneers like 
DIALOG, NEXIS, Dow Jones News/Retrieval, and DataTimes are scrambling to 
catch up with the 2nd generation, user friendly online services. No other 
service has tried to come further faster than DataTimes. With its new EyeQ 
service, DataTimes has taken a chance on a vastly ambitious and expensive 
makeover. The company has rebuilt from the ground up - new data, alliances, 
software, interface, and pricing. This bootstrap effort is generally 
successful and redefines DataTimes as a 2nd-generation online service. 
DataTimes has quickly assembled a comprehensive set of business information 
sources, but there are a half-dozen online services with comparable 
information. The EyeQ software is effective and attractive, but these 
features are taken for granted by today's computer-savvy information user. 
DataTimes' output-based costs put them at the forefront of pricing trends, 
but others are catching up. DataTimes will have to market and promote 
against names like Dow Jones and NEXIS, which are far better known. Thus 
much hard work remains for DataTimes. 

COMPANY NAMES: 

DataTimes 

GEOGRAPHIC NAMES: US 
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CLASSIFICATION CODES: 9190 (CN=United States); 8302 (CN=Software and 
computer services); 5250 (CN=Telecommunications systems); 9120 
(CN=Product specific) 

...TEXT: diverse set of databases. The screens are bright, attractive, and 
uncluttered. In both novice and command modes, search steps are 
efficiently and logically presented. Documentation in the Windows Help 
section is clear and thorough. EyeQ's major weakness is in the 
arrangement of databases. Several preformatted groupings are provided, 
but it is not easy to tell what sources are in what category. . . 
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SilverPlatter CD-ROM discs 

Ashworth, Wilfred 

New Library World v96n1121 PP: 37 1995 ISSN: 0307-4803 JRNL CODE: NLW 
DOC TYPE: Journal article LANGUAGE: English LENGTH: 1 Pages 
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ABSTRACT: The SilverPlatter Directory of CD-ROM Discs is almost a 1/4 of an 
inch thick and runs to 92 pages listing more than 200 databases. Many of 
these databases are available from other suppliers but differ in the layout 
and in the search software which accompanies them. SilverPlatter's search 
method has been designed to work with all their discs - a distinct 
advantage because it has to be learned only once and the owner of several 
databases does not have to install special software for each which would 
take up valuable space on a hard disk. Currently the search software comes 
on a separate CD-ROM which will install either SPIRS for DOS or WINSPIRS. 
It also carries a substantial demonstration database of medical references 
on AIDS, the software to load MACSPIRS for the Macintosh, and an electronic 
copy of the full SilverPlatter Directory. Other SilverPlatter databases are 
also discussed. 
COMPANY NAMES: 

SilverPlatter Information Inc 
GEOGRAPHIC NAMES: UK 

DESCRIPTORS: Data bases; CD-ROM; Applications; Software reviews; 
Directories 

CLASSIFICATION CODES: 5240 (CN=Software & systems); 9120 (CN=Product 
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. . .TEXT: is almost a quarter of an inch thick and runs to 92 pages listing 
more than 200 databases. Many of these databases are available from 
other suppliers but differ in the layout and in the search software which 
accompanies . . . 

... their discs - a distinct advantage because it has to be learned only once 
and the owner of several databases does not have to install special 
software for each which would take up valuable space on hard disk. 
Currently the search software comes on a separate CD-ROM which will 
install either SPIRS for DOS or WINSPIRS (the Windows version). It also 
carries . . . 

. . . edition of a textual database is one which can be confidently 
recommended for ease of use and effective searching - 

Nursing and Allied Health (CINAHL) is a comprehensive database of citations 
to nursing and health literature, 1983... 
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ABSTRACT: The total cost management (TCM) concept is exciting if not new. 
Prevention of unwarranted expenditures of time and dollars has been central 
to the total quality concept. In the US, total quality management is slowly 
finding its way into the mainstream. Cost engineering is still struggling 
to achieve recognition. There are things that members of the cost 
engineering profession can do to hasten the acceptance of management 
control into industry. First, cost engineers need to recognize some of the 
deficiencies that developed over time and resolve them. Next, they must 
recognize that users of cost engineering tools encompass the entire project 
organization and that each can use the service of cost engineers. Cost 
engineers must be responsible for presenting the facts quickly, succinctly, 
and accurately. They cannot assume that everyone understands the concepts 
they use on a routine basis. It is the responsibility of the cost engineer 
to make the organization aware of TCM through a representative training 
program. 

GEOGRAPHIC NAMES: US 

DESCRIPTORS: Cost engineering; Total quality; Project management; 
Professional development; Recommendations 
CLASSIFICATION CODES: 9190 (CN=United States); 3100 (CN=Capital & debt 
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...TEXT: of infrastructure projects, and downscoping of future work is 
headline fodder for the front page and business sections of local 
newspapers. The concept of Total Cost Management (TCM) is a cry in the dark 
when . . . 

. . . are written by an MIS department or outsourced through a reputable 
programmer. Think for a moment how many spreadsheets, databases, 
graphics packages, and even program management packages have been developed 
by frustrated cost engineers or adventurous MIS... can adjust your grouping 
of the audience accordingly. The executive information system model, 
discussed in the next section, will help you to further analyze your 
audience and to develop the type of analysis they need... will get immediate 
attention and recognition from the reader. The following war story should 
help focus the importance of this point. 

As a cost engineer, I have predicated my work on its being destined to 
drive project management to remain on course or take corrective ... this for 
thirty years" and don't need a schedule to manage the job. 

To summarize this section, all of the good analysis in the world isn't 
going to save your firm a penny. . . 

...topics and technical engineering topics. One meeting might be devoted to 
cost forecasting and the next on segmental bridge construction. Both 
groups learn from each other and both tend to work more closely. The 
training. . . 

. . . work may well be the most important since it opens the eyes of the team 
to a segment of a project that gets little vocational attention and, 
therefore, is viewed with caution, if not disdain... 
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ABSTRACT: Given the risks associated with distributed databases, chief 
information officers first need to understand important basic concepts 
associated with distributed database technology. Distributing data without 
also reengineering application software will increase rather than decrease 
the aggregate processing capacity needed to support a given application 
and, at the same time, increase the complexity and cost of maintaining that 
application. In developing a distributed database, it is particularly 
important to understand the interaction between processes and data - 
specifically, which application processes interact with which data entities 
and at what frequencies. Distributing a database does not represent an 
appropriate response to growing application workload and performance 
demands. Increasing the capacity of a single processing unit or adding 
additional processors to a symmetrical processing environment would be a 
more effective and scalable solution. 

GEOGRAPHIC NAMES: US 

DESCRIPTORS: Distributed processing; Data bases; Technological planning; 
Systems development; Requirements 
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...TEXT: basic concepts. 

WHAT IS A DISTRIBUTED DATABASE? 

Distributed database is a catch-all term used to describe several types 
of database processing capabilities — specifically, remote request, remote 
unit of work, distributed unit of work, and distributed request. Of... 

. . . database processing, only the distributed unit of work and distributed 
request support transactions in which data are split across two or more 
physical databases. This is what people usually think of when they hear 
"distributed. . . 

. . . per transaction. Some basic definitions follow. For consistency, the 
term client is used to describe any application function that requests 
services (e.g., create, read, update, delete) from a database. 

REMOTE REQUEST. A remote request allows a... 
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ABSTRACT: The chances of actually retrieving the desired information from a 
CD-ROM depend significantly on the retrieval engine supplied with the 
product. The proliferation of software and the increasing availability of 
single databases under several different access programs make software 
evaluation an important component in the overall CD-ROM assessment process. 
When evaluating a CD-ROM database product for purchase, the buyer must 
evaluate 3 components: 1. the database, 2. the interface, and 3. the 
retrieval engine. The most important evaluation criteria are users, 
requirements, and constraints. Other evaluation criteria for access 
software can be divided into 5 broad categories: 1. hardware and software 
dependencies, 2. interface features, 3. search and retrieval functions, 
4. output functions, and 5. general production features. A checklist is 
provided that outlines the general evaluation criteria to be considered. 
The evaluation process will become increasingly important as new software 
and new products proliferate. 

DESCRIPTORS: CD-ROM; Data bases; Information retrieval; CAR; User interface; 
Software packages; Functions; Evaluation 
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...ABSTRACT: the retrieval engine supplied with the product. The 
proliferation of software and the increasing availability of single 
databases under several different access programs make software 
evaluation an important component in the overall CD-ROM assessment process. 
When . . . 

... most important evaluation criteria are users, requirements, and 
constraints. Other evaluation criteria for access software can be divided 
into 5 broad categories: 1. hardware and software dependencies, 2. 
interface features, 3. search and retrieval functions, 4. output 
functions, and 5. general production features. A checklist is provided that 
outlines the general evaluation... 
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SECTION HEADING: News 

WORD COUNT: 738 

TEXT: 

Gupta Technologies Inc. is expected to announce next week imminent 
availability of SQLBase Server 5.0, which will be offered as a NetWare 
Loadable Module for Novell Inc. 's NetWare network operating system. 

... features discussed in August, which the CTA's Saks said apparently 
have been implemented, included support for databases partitioned 
across multiple disk drives or servers, faster and more efficient 
database queries, and maintenance of data integrity during accesses and 
manipulations by multiple users. 

All versions of SQLBase Server... 



30/5, K/15 (Item 1 from file: 275) 

DIALOG (R) File 275: Gale Group Computer DB(TM) 
(c) 2003 The Gale Group. All rts. reserv. 



02434310 SUPPLIER NUMBER: 65140851 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Really unique? (Technology Tutorial) 

Glassborow, Francis 
EXE, 15, 2, 33 
July, 2000 

ISSN: 0268-6872 LANGUAGE: English RECORD TYPE: Fulltext 
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for the Dinkumware implementation of the Standard Library 
explicitly states that the results are undefined if the predicate does 
not provide an equivalence relationship. The significance of this is that 
the owner of Dinkumware is P.J. Plauger; he drafted much of the library 
section of the C++ Standard. 

There are different ways of implementing unique for equivalence 
relationships. Two elements being... 
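
The point about equivalence relationships can be illustrated with a small sketch. It is written in Python rather than C++, and both function names are hypothetical; neither is claimed to be how Dinkumware or any other Standard Library implements unique. The two variants differ only in which element the candidate is compared against. They agree when the predicate is an equivalence relation and can disagree when it is not (here, a non-transitive "within 1" test).

```python
def unique_vs_last_kept(items, same):
    """Drop an element when it matches the last element that was kept."""
    out = []
    for x in items:
        if not out or not same(out[-1], x):
            out.append(x)
    return out


def unique_vs_previous(items, same):
    """Drop an element when it matches its immediate predecessor in the input."""
    out = []
    for i, x in enumerate(items):
        if i == 0 or not same(items[i - 1], x):
            out.append(x)
    return out


def equal(a, b):
    # A true equivalence relation: reflexive, symmetric, transitive.
    return a == b


def near(a, b):
    # Not transitive: 1~2 and 2~3, but 1 and 3 are not "near".
    return abs(a - b) <= 1
```

With `equal`, both variants reduce `[1, 1, 2, 2, 3]` to the same list. With `near` on `[1, 2, 3, 4]` they give different answers, which is one way to see why a specification might leave the result undefined for such predicates.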
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Winning the client-server game. (ODBC) 
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ISSN: 1061-3501 LANGUAGE: English RECORD TYPE: Abstract 

ABSTRACT: Developing a client/server application with Microsoft's Visual 
Basic requires the developer to understand a large body of rules. Knowing 
when and how to use ODBC is a particularly complex issue. Moving data 
between a VB application and a server database requires the use of ODBC. 
Before ODBC, developers had to learn different call libraries for each 
database, but ODBC uses a common set of calls to minimize the learning 
curve. Using the Microsoft Jet engine provides a common way to use ODBC. 
Jet includes a full SQL engine, and can parse SQL statements and 
optimize queries. Jet is tuned to interface with server data, and 
permits both forward and backward motion without having to manage multiple 
database connections. Jet's Data Object layer provides a single access 
method, and makes databases available to custom controls. A Data Object 
manipulates the database, without the program having to be aware of which 
database is being used. 

SPECIAL FEATURES: illustration; table; chart 
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. . .ABSTRACT: Jet engine provides a common way to use ODBC. Jet includes a 
full SQL engine, and can parse SQL statements and optimize queries. 
Jet is tuned to interface with server data, and permits both forward and 
backward motion without having to manage multiple database connections. 
Jet's Data Object layer provides a single access method, and makes 
databases available to custom. . . 
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Complex databases can speed response via parallel hardware, software 
design. 
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ABSTRACT: Databases are increasingly being developed to speed responses to 
user queries via parallel processing design. The move necessitates that IS 
managers familiarize themselves with the parallel hardware on which the 
databases run in addition to understanding parallel database architecture. 
Implementing parallel structures in both software and hardware reduces the 
chance of bottlenecks; queries can be split into smaller tasks and executed 
on different processors, and response time is improved by partitioning data 
over a number of disks. Parallel hardware architectures are either tightly 
or loosely coupled, as in symmetric multiprocessing (SMP) and massively 
parallel processing (MPP), respectively. Key advantages of SMP include its 
suitability for shared 
memory and its operating system-sharing capabilities. MPP machines scale 
better than SMP machines, although SMP systems are improving in this area. 
Data partitioning and query optimization can be combined to create a 
parallel partitioning architecture. 
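
A minimal sketch of the idea in this abstract, assuming a simple selection query: rows are spread over several partitions (standing in for disks), the query is split into one scan task per partition, and the partial results are merged. The function names and the thread-pool approach are illustrative assumptions, not taken from the article or from any particular database product.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_rows(rows, n_partitions, key):
    """Spread rows over n_partitions lists by hashing the partitioning key."""
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        partitions[hash(row[key]) % n_partitions].append(row)
    return partitions


def scan_partition(partition, predicate):
    """One sub-task of the query: filter a single partition."""
    return [row for row in partition if predicate(row)]


def parallel_query(partitions, predicate):
    """Run one scan task per partition concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partial_results = pool.map(
            lambda part: scan_partition(part, predicate), partitions
        )
        return [row for part in partial_results for row in part]
```

Threads are used here only to show the task split; a real parallel database would place the partitions on separate disks or processors.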

SPECIAL FEATURES: illustration; chart 
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... already added them to their products. The databases simply 
coordinate multiple load or dump tasks running on separate processors. 

Optimizing queries for quicker response is more difficult and is 
probably the main area on which vendors of parallel... 
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... into managing more diverse objects, including video, animation and 
even sound, other viewers will also be necessary. * Multiple database 
access. Xyvision has gone the extra mile to accommodate user requests for 
additional functionality thus far, and we would expect it to continue in 
this manner. The new Windows client, undoubtedly... 

...of the software. Another user-oriented refinement that we think deserves 
attention is the ability to access multiple databases (currently there 
can be only one) using the same sql sequences and interface. Many sites 
will have... 
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ABSTRACT: A database system is generally considered successful if it is 
capable of fast and accurate search and retrieval of desired records. 
However, such precision and speed is difficult and costly to achieve where 
data is dispersed among many relational databases located throughout a 
network. This is especially so if data is unstructured. For a distributed 
relational database system where relations are commonly divided up into 
fragments, there are several recommended methods to efficiently act on 
queries. Among them are the identification of local processing 
opportunities, adoption of a fragment-and-replicate strategy, use of 
partition-and-replicate technique and hashed partitioning. For 
heterogeneous multidatabases, there are several factors to consider such as 
the front end, schema integration and query translation. 
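
As a rough illustration of one of the strategies named here, the sketch below shows a fragment-and-replicate join as it is commonly described: the larger relation is split into fragments across sites, the smaller relation is copied to every site, each site joins locally, and the local results are combined. The abstract does not give the article's own algorithm, so the function names, the round-robin fragmentation, and the single-process simulation of sites are all assumptions.

```python
def fragment(rows, n_sites):
    """Split a relation into n_sites fragments, round-robin."""
    return [rows[i::n_sites] for i in range(n_sites)]


def local_join(fragment_rows, replicated_rows, key):
    """Join one fragment against the full replicated relation at a single site."""
    index = {}
    for r in replicated_rows:
        index.setdefault(r[key], []).append(r)
    return [{**f, **r} for f in fragment_rows for r in index.get(f[key], [])]


def fragment_and_replicate_join(large, small, key, n_sites):
    """Fragment `large`, replicate `small` to every site, join locally, merge."""
    results = []
    for site_rows in fragment(large, n_sites):
        # Each iteration stands in for work done independently at one site.
        results.extend(local_join(site_rows, small, key))
    return results
```

Because every site holds the whole of the smaller relation, no fragment needs data from another site to complete its share of the join.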

SPECIAL FEATURES: illustration; program; chart 
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...ABSTRACT: records. However, such precision and speed is difficult and 
costly to achieve where data is dispersed among many relational 
databases located throughout a network. This is especially so if data is 
unstructured. For a distributed relational database system where relations 
are commonly divided up into fragments, there are several recommended 
methods to efficiently act on queries. Among them are the 
identification of local processing opportunities, adoption of a 
fragment-and-replicate strategy, use of partition-and-replicate technique 
and hashed partitioning. For heterogeneous multidatabases, there are 
several factors to consider such as the front end, schema integration and 
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ABSTRACT: The UK's Computing Services Industry Council (COSIT) is 
attempting to develop a set of national standards for project management. 
Information technology projects often fail to meet deadlines and budget 
limits. COSIT Dir Gordon Ewan believes that mismanagement is the cause of 
many of the failures. About 20 companies and government agencies are 
participating in the project management research program. Each organization 
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project and conduct self appraisals, and provides each staff member with a 
career log itemizing standards for and developing levels of competence. The 
main goals of the research are to determine what skills and attributes are 
important to successful project management, create a project skills 
dictionary predicated on a software development cycle, and develop a 
project management framework. 
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...ABSTRACT: are to determine what skills and attributes are important to 
successful project management, create a project skills dictionary 
predicated on a software development cycle, and develop a project 
management framework. 

... a series of working parties to improve the standard of project 
management in the UK. Research is divided into three main areas: the 
first will define the attributes and skills needed to make a project... 
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... are they? The answer is query speed and distributed database. Well, 
as the entire world knows distributed database means many things to 
many people, and most of them don't work. However on the query front, IBM 

...query. It has done this by using multiple index access paths or 
multi-index searching. Imagine a query that wants to explore three 
separate fields, dictating a maximum or minimum value to each and slim 
down the records just to those that comply. It is the sort of query 
function that relational databases seemed to be invented for, for instance 
"Find me all the employees that have. . . 
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The harmonization race may be running smoothly when it comes to the use of 
standards, but it could be a rough road for manufacturers on the labeling 
front. For one thing, the race hasn't even started. "The EU [European 
Union] and U.S. are not harmonizing on the labeling issue. They went one 
way and we're going our own. No one has put the discussion up on the 
table," Center for Devices Director Dr. Bruce Burlington, M.D., said in a 
March 26 interview. FDA is, however, looking into how the Office of Device 
Evaluation's (ODE) controversial Highlights of Labeling Information (HLI) 
proposal will fit into the international arena, said Dr. Daniel Spyker, 
M.D., deputy director of the Division of Cardiovascular, Respiratory and 
Neurological Devices. HLI, originally called "Essential Prescribing 
Information," grew out of an initiative of the drugs and biologics centers 
to provide health care practitioners with the essentials of a drug or 
device. FDA issued a draft proposal in April 1997 and had a September 
public meeting on the issue. After being bombarded with complaints from 
industry, CDRH scaled back its proposal and will issue a new draft. As a 
result, HLI could go one of two ways still. Spyker said it could be a 
one-page summary of the most important labeling information or it could 
simply be highlights within the labeling document itself. For example, 
manufacturers could pull out sections into the left margin for greater 
emphasis, he explained. There will be a section dealing with international 
issues when the agency issues the next HLI guidance document, Spyker said. 
The final guidance was not issued last month, as expected, and Spyker had 
no idea when it would be issued. The Health Industry Manufacturers Assn. 
(HIMA) has opposed HLI, partly due to major modifications in international 
labeling already under way. "There are a whole lot of labeling changes 
going on in the EU right now for the CE Mark. To have to do that plus 
massive labeling changes here in the U.S. doesn't sound like a good idea to 
us," said Dr. Marlene Tandy, M.D., director of technology and regulatory 
affairs and associate general counsel at HIMA. This June, the Medical 
Device Directive becomes fully transposed in all EU nations, meaning 
manufacturers of certain classes of devices will need a CE mark from a 
notified body in order to sell their products in Western Europe. Tandy 
added that the trade association has not examined EU and U.S. labeling 
requirements in depth and cannot determine if there will be any major 
differences that could cause serious concern for industry. However, at 
repeated conferences, regulatory affairs representatives have groaned about 
how Europe's hodgepodge of national labeling requirements has created 
hassles for American firms in getting products on the market - some 
bordering on trade barriers (See September issue, page 8). 

Technically, HLI is a guidance document that will not be enforced by 
FDA, but Tandy disagreed. "In practice, reviewers will say, 'Where's your 
HLI?' FDA claims that it is a guidance, but reviewers don't use it that 
way; we haven't found a solution for this," she argued. According to the 
association's comments on the concept: "HIMA has received word that at 
least one reviewer in ODE has already begun to use this draft labeling 
guidance as a requirement in discussions with a device manufacturer during 
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... as the two predicates. In fact, the only difference mentioned in 
the 510(k) was magnet field strength, which went from 0.5 Tesla (T) in 
the predicate to 1.0T in the T10-NT. According to the submission, it "is 
a 1.0 Tesla... 

...system is indicated for use as a diagnostic device that produces 
transverse, sagittal, coronal and oblique cross-sectional images of the 
internal structure of the human head, body or extremities." The firm added 
that image. . .was adequate and that no further information was needed. 
However, Philips agreed to modify the acoustic noise section of the 
Operators Manual to include the data FDA had wanted. In that subsection, 
the firm recommended. . . 
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At IBM UK's DB2 briefing last week, the company made it clear that there 
were plenty of improvements in the DB2 release 2.2 that ships this month - 
even if a big improvement in raw transaction throughput was not one of 
them. So what are they? The answer is query speed and distributed 
database. Well, as the entire world knows distributed database means many 
things to many people, and most of them don't work. However on the query 
front, IBM seems to have come up trumps, with a 10-fold overall improvement 
and a fivefold reduction in processor time needed to handle a query. It has 
done this by using multiple index access paths or multi-index 
searching. Imagine a query that wants to explore three separate fields, 
dictating a maximum or minimum value to each and slim down the records just 
to those that comply. It is the sort of query function that relational 
databases seemed to be invented for, for instance "Find me all the 
employees that have been here more than 10 years, that have sales 
experience and that earn less than GBP25,000 a year". 
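
A small sketch of multi-index searching as this article describes it: each index is probed for its own predicate and yields only row identifiers, the identifier sets are compared, and data rows are read only for the final answer. The `Index` class, its method names, and the in-memory list standing in for data pages are illustrative assumptions; this is not DB2's implementation.

```python
from bisect import bisect_left, bisect_right


class Index:
    """A sorted list of (value, rid) pairs for one column."""

    def __init__(self, rows, column):
        self.entries = sorted((row[column], rid) for rid, row in enumerate(rows))
        self.keys = [value for value, _ in self.entries]

    def rids_above(self, low):
        """Row identifiers whose value is strictly greater than `low`."""
        start = bisect_right(self.keys, low)
        return {rid for _, rid in self.entries[start:]}

    def rids_below(self, high):
        """Row identifiers whose value is strictly less than `high`."""
        end = bisect_left(self.keys, high)
        return {rid for _, rid in self.entries[:end]}

    def rids_equal(self, value):
        """Row identifiers whose value equals `value`."""
        start = bisect_left(self.keys, value)
        end = bisect_right(self.keys, value)
        return {rid for _, rid in self.entries[start:end]}


def multi_index_search(rows, rid_sets):
    """Intersect the rid sets first; fetch rows only for the survivors."""
    qualifying = set.intersection(*rid_sets)
    return [rows[rid] for rid in sorted(qualifying)]
```

For the example query, one would build an index per column and pass `rids_above(10)` on length of service, `rids_equal(True)` on sales experience, and `rids_below(25000)` on salary to `multi_index_search`.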
Multiple index 

Perhaps the criteria that John Akers himself looks for when fleshing 
out the IBM sales team these days. Any such query would do. What single 
index searching does is exactly what the question asks, it finds the data 
pages with the length of service over 10 years and asks what next. Multiple 
index searching first looks at the whole question, and sets up three 
separate jobs, looking for length of service, at all the records of 
historical job titles and down the salary column as well. Instead of 
reading in those data pages it just loads the pointers to each set of 
qualifying records, then compares them. IBM is no longer allergic to the 
word "pointer," when used in conjunction with relational databases, which 
it was when Codd was at the helm, but it prefers to call them Row 
Identifiers or Rids. By comparing them, it can reduce the Rids to the final 
answer before ever looking at a data page. That cuts down a lot of 
input-output as well as a lot of work. It isn't the fact that the computer 
has three eyes looking down the three columns at once that speeds the 
process up, because it still does it one job at a time. But using an 
intelligent pre-fetch instruction (an asynchronous fetch that invokes an 
input-output without tying up a processor waiting for the disk to respond) 
it at least has the three disk locations being addressed in parallel, even 
if the final processing can't be done that way. That's where the statement 
of direction on multiprocessors comes in - IBM should be able to speed up 
those enquiries further once it can work one enquiry across more than one 
processor. IBM got the 10-fold increase by searching a 10m row table 
looking for two predicates connected by the "AND" operator, where the 
result was 9,851 rows that qualified. Another contributor to the improved 
performance has been the issue of query optimisation, and here IBM was 
handing out some clues to the future. All of its relational database 
products have some form of optimiser which looks at data about the file 
sizes, types and structure (isn't this part of what we used to call 
meta-data?) and decides upon the most sensible way to answer the SQL query. 
IBM calls these data distribution statistics and the additional ones it is 
keeping include the 10 most frequently used multi-index searches and 
non-unique indexes. These can be adjusted by the database administrator at 
the site and invoked or not using the DB2 Runstats utility. Will all of 
this type of meta-data become the province of the Repository Manager? Yes, 
but only in the fullness of time. These improved optimisation algorithms 
have created a headache for IBM and others for as long as anyone can 
remember. Chris Date tells an anecdote about trying to design an optimiser 
that will work across an entire distributed database. The same anecdote, 
slightly updated, is still doing the rounds of IBM presentations and like 
all good jokes it weakens with the number of people who have told it 
before. IBM's current joke about this is that you can have two chunks of 
data, one on an MVS mainframe, another on an OS/2 machine. The MVS system 
has 10m rows and the PS/2 has just 100, and an SQL query needs a relational 
join done on the combined databases. "Well, the first thing that it tried to 
do was download all 10m records from the remote mainframe to the OS/2 
machine." Some canned laughter. Date's version was simply that "We turned 
the algorithm off and asked the question with the variables in three 
different orders. One time it took three seconds, another time it took 
three hours and they're still waiting for the third one to finish." You had 
to be there, but however you cut it, it is still the same joke, and the 
same problem. The solution, however, is to have each of the local optimisers 
in constant touch with each other, and make sure that all SQL queries are 
fully compiled first and not taken on the fly, first search term first. It 
is easy to establish the last part, but optimisers that talk reliably and 
which take account of each other's statistics are another matter, 
especially where the decision to feed or not to feed all the available 
statistics to the optimiser is made at database administrator level. Human 
error will definitely find a way of creeping through into a seriously 
complex network. Apart from these improvements in performance, what can IBM 
deliver that's new this month? 
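
The multi-index search described above can be sketched roughly as follows. 
This is an illustration only, not IBM's code: the toy table, the column 
names and every function name are assumptions made for the example, and a 
row's position in the list merely stands in for its Rid. Each index is 
scanned for qualifying Rids, the Rid sets are compared, and only the 
survivors are read from the "data pages". 

```python
# Illustrative only: a toy table where a row's list position stands in for its Rid.
EMPLOYEES = [
    {"name": "Ann",  "years": 12, "sales_exp": True,  "salary": 21000},
    {"name": "Bob",  "years": 15, "sales_exp": False, "salary": 19000},
    {"name": "Cara", "years": 4,  "sales_exp": True,  "salary": 18000},
    {"name": "Dev",  "years": 11, "sales_exp": True,  "salary": 30000},
    {"name": "Eve",  "years": 20, "sales_exp": True,  "salary": 24000},
]


def build_index(rows, column):
    """Map each value in a column to the set of Rids holding that value."""
    index = {}
    for rid, row in enumerate(rows):
        index.setdefault(row[column], set()).add(rid)
    return index


# One (hypothetical) index per searched column, built ahead of the query.
INDEXES = {col: build_index(EMPLOYEES, col)
           for col in ("years", "sales_exp", "salary")}


def qualifying_rids(index, predicate):
    """Walk the index entries only and collect the Rids that satisfy the predicate."""
    rids = set()
    for value, value_rids in index.items():
        if predicate(value):
            rids |= value_rids
    return rids


def multi_index_search(rows, indexes, predicates):
    """AND the predicates by intersecting Rid sets, then fetch each row once."""
    if not predicates:
        return list(rows)
    rid_sets = [qualifying_rids(indexes[col], pred)
                for col, pred in predicates.items()]
    final_rids = set.intersection(*rid_sets)
    # Only now are the "data pages" touched, and only for the final answer.
    return [rows[rid] for rid in sorted(final_rids)]


# More than 10 years' service, sales experience, less than GBP25,000 a year.
QUERY = {
    "years": lambda y: y > 10,
    "sales_exp": lambda s: s is True,
    "salary": lambda p: p < 25000,
}

matches = multi_index_search(EMPLOYEES, INDEXES, QUERY)
print([row["name"] for row in matches])  # ['Ann', 'Eve']
```

The saving in this sketch comes from the same place the article describes: 
the three index scans produce small sets of Rids, and the rows themselves 
are read only after the sets have been reduced to the final answer. 
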
Distributed database 

Distributed database means many things to many people. To IBM it means 
four things, and roughly it plans to deliver two and a half of these this 
month. The first is to give Systems Application Architecture SQL requests 
remote access across a network to any database; the second, to provide 
transaction integrity between one local database and one other remote 
database, and process SQL requests on either one of them but not both; the 
third is to pre-compile and bind an SQL request, and have it extract data 
from both databases; and the fourth, to have fully distributed requests 
search multiple databases and deliver the answer to one database user 
transparently. IBM reckons that in DB2 Version 2.2 it gets you somewhere 
between two and three, by getting data from two databases at once, but at 
the cost of guaranteed transaction integrity, "two phase commit has to be 
handled in user programming if you're using more than one database right 
now," and without the help of synchronised optimisers. For those of us that 
can't help thinking of distributed database as item four and item four 
alone, what does IBM suggest we do with our data in the meantime? "Put 
it where it makes most sense," said Starkey firmly. It was impossible to 
believe that IBM meant anything other than "Keep it on the mainframe." 
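
For readers wondering what handling two phase commit "in user programming" 
might involve, here is a minimal sketch. It is not DB2's interface: the 
Participant class and the two_phase_commit function are hypothetical 
stand-ins, and a real implementation would also need durable logging and 
recovery after a failure, which are left out here. 

```python
class Participant:
    """Hypothetical stand-in for one database taking part in the transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote "yes" only if the pending work could be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        # Phase 2 (success path): make the prepared work permanent.
        self.state = "committed"

    def rollback(self):
        # Phase 2 (failure path): discard any pending work.
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit everywhere only if every participant votes yes; otherwise roll back all."""
    # all() stops at the first "no" vote, so later participants are never prepared.
    if all(p.prepare() for p in participants):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


local, remote = Participant("local"), Participant("remote")
ok = two_phase_commit([local, remote])

local2, remote2 = Participant("local"), Participant("remote", can_commit=False)
failed = two_phase_commit([local2, remote2])
```

In the first run both databases end up committed; in the second, the remote 
database's "no" vote forces both to roll back, which is the all-or-nothing 
guarantee the application has to supply for itself. 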

...are they? The answer is query speed and distributed database. Well, as 
the entire world knows distributed database means many things to many 
people, and most of them don't work. However on the query front, IBM... 

...query. It has done this by using multiple index access paths or 
multi-index searching. Imagine a query that wants to explore three 
separate fields, dictating a maximum or minimum value to each and slim 
down the records just to those that comply. It is the sort of query 
function that relational databases seemed to be invented for, for instance 
"Find me all the employees that have... 

...knowledge, and (b) the processing organization's knowledge base 

(see Figure 1). 

An aside to conclude this section. 

The existing software that performs the meaning unfolding 
process — so-called EDI translator software — is widely regarded... a CET and 
a look at the wrapper are required, as will be discussed in the next 
section. 

AN ASIDE ON MAPPING TO THE WORLD 
No semantics can ever be entirely formal. At some point ... here. Notice 
that the bridge laws ((15) and (16)) are completely general and use only 
the controlled vocabulary (verb predicates, thematic role predicates 
(from the controlled vocabulary of lean-event semantics), and predicates 
from the logic of... 

ABSTRACT: Part of a special section on the factors that go into 
shippers' bottom-line decisions to choose a nonvessel operating common 
carrier examines the figuring of the bottom line. Although some larger NVOs 
have taken aim at the full-container market, most still rely primarily on 
LCL business. As every tariff is unique in structure and usage, it is 
difficult to compare carrier and NVO rates. With similar commodities, the 
base rates between the carrier types may show that the NVO can give a 
better deal than the carrier. However, the shipper should note that extra 
handling charges tacked onto the base rate may result in drastically 
different rates, with the NVO's rate being no more competitive than the 
carrier's. All rates are predicated on weight or measure, and three 
standards have commonly been used by carriers, each of which is briefly 
discussed. In addition, the way in which smaller shippers can save money on 
LCL shipments through forming shipper associations is examined. 

...rates, with the NVO's rate being no more competitive than the carrier's. 
All rates are predicated on weight or measure, and three standards have 
commonly been used by carriers, each of which is briefly discussed... 

importance here is that the occurrence of names in articles is 
highly dynamic so the use of dictionaries is not predicated. 

There is clear evidence that where indexing is a requirement, various 
automated techniques can ease the process... 



